Kbase 2879: proutil dbanalys report Block allocation with dump and load
Autor |
  Progress Software Corporation - Progress |
Acesso |
  Público |
Publicação |
  10/05/1998 |
|
proutil dbanalys report Block allocation with dump and load
900810-sal01
INTRODUCTION:
=============
Running Progress Database Analysis with proutil dbanalys is a valuable
tool to use to analyze the status of your database. It takes a
snapshot of the database and checks how well data are distributed
with the database and inspects the free chains of both the index
and database blocks.
WHY YOU NEED TO DO THIS:
========================
To evaluate space usage and check for potential fragmentation within
the database, indicating how well or poorly space is being utilized.
WHAT THE DATABASE MANAGER IS DOING INTERNALLY
=============================================
Some facts about record space allocation:
Database blocks are used to store records that have a variable size
block header, that is, 20 + (2 * number of records in block) long.
Records (and the fields they contain) are stored variable length.
The length is determined by the amount of space required to
store the values contained in each field, plus a small amount
of overhead for lengths, etc.
Up to 32 records can be stored in a single database block,
if they fit. If a record is larger than what can fit in a
block, PROGRESS splits it across multiple blocks, making the
fragments as large as possible.
When PROGRESS allocates space for records in database blocks,
it tries to leave 75 bytes free in each *block* for expansion
space, in case the record is updated in the future.
This is done to reduce the number of records that are split
across multiple database blocks.
Partially filled record blocks are kept on a chain called the
"RM (Record Manager) chain". After a record is created,
updated or deleted, if the block is not already on the
RM chain, and has at least 250 bytes free (150 for systems
with 512 byte blocks), PROGRESS puts the block on the chain.
This is done so that it can find and use partially filled blocks
later.
In all current releases, the database block size is fixed when
Development compiles PROGRESS for a particular system.
The sizes currently used are:
Block Space for
Size Expansion
(bytes) (%) Operating system
512 15 DOS OS/2, NLM
2048 4 Pyramid, VAX Ultrix, Sequent, VMS, Nixdorf
1024 7 All others
PROCEDURAL APPROACH:
====================
Run proutil dbname -C dbanalys > output.file. This will run the report
to an output file that can be read; the default (without a ">"
redirect) will send the output to the terminal.
HOW TO READ THE DB ANALYSIS:
===============================
The report is broken into 3 catagories:
1. Free Chain Analysis
======================
"<n> block(s) found in the free chain"
This is the number of completely empty blocks that are available
for use. They are listed in the free chain and are available for
any usage.
2. RM Chain Analysis
====================
"LIST OF RM CHAIN BLOCKS"
These are the blocks that may (not guaranteed)
have free space that is available to be used
to add new records. Rather than read every RM (Record Manager)
block to find one with free space, PROGRESS builds what is called
an "RM Chain Block", that lists all the blocks with available
space. Each block in the chain points to the dbkey of the next
block.
The listing gives first the dbkey that is available, second
the amount of free space in bytes, third the number of
record slots in the block that are free, followed by the number
of slots being used.
As mentioned above, in most PROGRESS databases,
the size of each block is .5K (512 bytes), 1K (1024 bytes),
or 2K (2048 bytes). The dbkey is the record id,
a 32-bit binary number that contains 27 bits for the "Block
Number" and 5 bits for the "Slot Number". Each block has 32
possible slots, therefore 32 possible records can be in each
block. The dbkey will never change unless you dump and reload the
database. It is also possible for a record to be spread across
multiple blocks.
At the end of the chain list is the total amount of blocks in the
RM free chain. The chain should be checked to be sure that the
chain is not "clogged" with small size entries at the beginning.
This can cause inefficient use of space, as a free (empty) block
would be used rather than existing blocks from the RM chain list.
Should this be the case, there are a couple of options that you
have:
1. Dump and reload the database to reformat the space and
reallocate free space. This may not always be the best choice --
one would have to check how well the database is allocated by
checking the RM block space utilization percentage. (see the
next section).
2. Try adding a number of smaller dummy records to remove the
entries from the beginning of the chain. This sometimes
works but in no way guarantees that the chain will not
clog again later.
3. BLOCK ANALYSIS
=================
"<n> RM blocks that are candidates for the RM free chain"
(Version 5 and earlier)
These are blocks used for the free chain. The number is
irrelevant and has been removed in Version 6.
"<n> RM block(s) found in the database"
"<n>% of the RM block space is used"
-----------------------------------------
This is the total number of blocks in the Record Manager
portion (data) of the database.
The block size times the number of blocks will give
the actual data size. The percentage should be between 70% and 90%.
If it is below 70% then the database is a candidate for a dump and
reload. If the size is over 90% then the database space is being
allocated very well.
This fiqure must also be interpreted within the context of one's
own database. If the database is very static, a high RM % is what
you would like: a lot of data stored in the smallest area.
In a dynamic situation with a lot of updates and additions, this
may not be the best scenario, in that it could lead to the
database being fragmented.
"<n> index block(s) found in the database"
"<n>% of the index block space is used"
------------------------------------------
This is the total number of Index Blocks (Indexes) of the
database. The block size times the number of blocks will give
the actual index size. On average, the index blocks are 60% to 75%
full.
Index Block Summary
-------------------
Index name - The index name
Index - This is _Index._index-num
Blocks - The number of blocks for that index
Bytes - The number of bytes used by that index
%Utiliz - The percentage of space used within the blocks
allocated to that index. Should this percentage
be less than 70% and the index is large,
it would be an indication that the index should
be rebuilt.
% of all Indexes - This is the size of the index (%) as related to
all the indexes. The larger the percentage,
the greater impact of an index rebuild on space.
4. Summary
==========
1. Rebuild any large indexes under 70% utilization.
2. If RM block space is less than 70% utilization or exceeds 90%,
then the database may be a candidate for a dump and reload to re-
allocate space. Again, these numbers are a guideline only.
Whether it should be dumped or left alone depends on how your
data are used.
3. Check the RM chain list to make sure that the chain is not being
clogged with small byte size records at the top of chain. A sign of a
clogged chain is when the database appears to be growing when there
have been numerous deletes. In version 4, the first record is
important; in version 5, the first five records are important.
In versions 6 and 7, PROGRESS uses better algorithms
in determining RM free chain utilization.
You can try the two options listed under the RM Chain Analysis section
to free the chain if it is clogged.
Again, this will only fix the current situation and will
not prevent further clogging. We continue with each version to try to
improve the usage of RM chains, weighing performance against database
space utiliziation.
4. Every Database is different. This has to be taken into account
when evaluating any numbers that the Dbanalys should return. Be sure
to understand your own database before making any decisions to dump
and reload. Evaluate whether it is static or dynamic. Also note
that in some cases where a database is spread across multiple
disks, you may have better performance leaving it, because of the
randomness in the physical distribution of the data.
REFERENCES TO WRITTEN DOCUMENTATION:
====================================
SYSTEM ADMINISTRATION GUIDE
Progress Software Technical Support Note # 2879