Kbase P7561: How to calculate the best number of records per block for storage areas
Author |
  Progress Software Corporation - Progress |
Access |
  Public |
Published |
  1/12/2011 |
|
Status: Verified
GOAL:
How to calculate the best number of records per block for storage areas
GOAL:
How to determine the optimal records per block for a specific area
FACT(s) (Environment):
All Supported Operating Systems
Progress 9.x
OpenEdge 10.x
FIX:
Use a recent dbanalys report to determine the best block size and number of records per block for your storage areas.
The standard output of a dbanalys looks like this:
RECORD BLOCK SUMMARY FOR AREA "Employee" : 7
-------------------------------------------------------
                         -Record Size (B)-  ---Fragments--- Scatter
Table           Records  Size   Min Max Mean  Count  Factor  Factor
PUB.Benefits         21  848.0B  39  41   40     21     1.0     1.0
PUB.Department        7  211.0B  26  35   30      7     1.0     2.7
PUB.Employee         55    6.2K  99 135  115     55     1.0     0.8
PUB.Family           72    3.1K  38  51   44     72     1.0     1.1
Determine what is an important table for your database and application.
Usually tables which are very large and used frequently should be considered first.
For discussion let's say that Employee is a critical table.
It satisfies our standard criteria that it is big (both in record count and byte count).
The average record size is 115 bytes.
The possible blocksizes Progress can use are 1k, 2k, 4k and 8k.
Most commonly you choose a blocksize which is equal to or a multiple of your file system blocksize.
Defining storage areas and records per block is always a balance between performance and wasted space.
For example let's say our file system blocksize is 4k (or 4096 bytes). Progress must use a portion of that 4k for internal referencing so the usable space for data is typically 100 bytes less than the block size.
How many 115-byte records could fit in 3996 bytes? Answer: 34.
Progress will only allow you to set 1, 2, 4, 8, 16, 32, 64, 128, or 256 as a valid number of records per block.
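The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration of the article's rule of thumb, not a Progress utility: the 100-byte block overhead is the approximation used in this article, and the exact figure varies by Progress version and block type.

```python
# Sketch of the records-per-block arithmetic described above.
# BLOCK_OVERHEAD is the ~100-byte rule of thumb from this article (an
# approximation; the real header size depends on version and block type).
VALID_RPB = [1, 2, 4, 8, 16, 32, 64, 128, 256]
BLOCK_OVERHEAD = 100

def records_that_fit(block_size, mean_record_size):
    """How many mean-sized records fit in one block's usable space."""
    return (block_size - BLOCK_OVERHEAD) // mean_record_size

def candidate_rpb(block_size, mean_record_size):
    """The valid records-per-block settings bracketing the physical fit."""
    fit = records_that_fit(block_size, mean_record_size)
    lower = max((r for r in VALID_RPB if r <= fit), default=VALID_RPB[0])
    higher = min((r for r in VALID_RPB if r >= fit), default=VALID_RPB[-1])
    return lower, higher

print(records_that_fit(4096, 115))  # 34 records fit in a 4k block
print(candidate_rpb(4096, 115))     # (32, 64) - the two settings to weigh
```

For the 4k block and the 115-byte Employee record, the choice comes down to 32 (no wasted slots) versus 64 (no wasted bytes but many wasted slots), as discussed below.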
So which should we choose?
If I choose 32 records per block then I would have 32 records each 115 bytes in a 4k block.
That would utilize 3680 bytes (+ 100 for our header information).
Only 316 bytes would be wasted, and no record slots would be wasted.
When defining the number of records per block we are stipulating that within each block will live a certain group of records.
Example:
block 5 = contains records 129 to 160
block 6 = contains records 161 to 192
If I use up all the space in block 5 with 1 record then the other 31 records can't be put in block 5 because there is no space for them.
Record numbers are explicitly assigned to a block; the mapping is fixed by the records-per-block setting chosen when the area is created.
The Record Manager controls record placement based on the number of records per block and the available space within that block following a specific defined pattern.
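The fixed slot ranges in the example above can be modeled with simple arithmetic. This is only an illustration of the idea that each block owns a fixed range of record numbers; real Progress RECIDs are encoded differently.

```python
# Simplified model of fixed record-slot ranges: with 32 records per block,
# each block owns a contiguous range of 32 record numbers.
# (Illustrative only - actual Progress RECID encoding differs.)
RPB = 32  # records per block, as in the example above

def block_for_record(record_number, rpb=RPB):
    """Which block a given record number lives in (1-based)."""
    return (record_number - 1) // rpb + 1

def record_range(block_number, rpb=RPB):
    """The record numbers assigned to a given block."""
    first = (block_number - 1) * rpb + 1
    return first, first + rpb - 1

print(record_range(5))  # (129, 160) - matches the example above
print(record_range(6))  # (161, 192)
```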
Let's change the example and say our file system blocksize is 8k.
8192 - 100 bytes (our header) = 8092
8092 / 115 = 70 records.
I could use 64 or 128 records per block.
If performance were an issue then I would go with 128 records per block.
I couldn't get all 128 records into a block, but each block would hold 70 records. That is only slightly better than using 64, yet in performance-critical operations it is still a small saving in the time to retrieve the records and the memory to hold them.
I would be wasting 58 record slots, because each record is assigned a specific block to live in, and if a record cannot fit into that block it has nowhere to live.
If I chose 64 then I would have 732 bytes wasted per block, but no record slots.
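The 8k trade-off above can be checked with a short sketch, using the article's numbers (115-byte mean record, ~100-byte block header, both approximations for this example):

```python
# Compare wasted bytes vs. wasted record slots for an 8k block,
# using the article's example figures (approximations, not exact internals).
BLOCK_SIZE = 8192
OVERHEAD = 100    # ~100-byte block header, per the article
MEAN_REC = 115    # mean Employee record size from dbanalys

def trade_off(rpb):
    usable = BLOCK_SIZE - OVERHEAD
    fit = usable // MEAN_REC          # records that physically fit: 70
    stored = min(rpb, fit)            # records actually stored per block
    wasted_slots = max(rpb - fit, 0)  # slots with no room for a record
    wasted_bytes = usable - stored * MEAN_REC
    return stored, wasted_slots, wasted_bytes

print(trade_off(64))   # (64, 0, 732): no wasted slots, 732 bytes unused
print(trade_off(128))  # (70, 58, 42): 58 wasted slots, block nearly full
```

With 128 records per block the blocks pack tighter (70 records each), at the cost of 58 unusable record slots per block; with 64 no slots are wasted but each block leaves 732 bytes unused.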
At the present time (9.1x) there is a limit of 2 billion records per area.
When you are approaching 2 billion records in one area, you will likely need to create a new area with a lower number of records per block and do a 'table move' of that table into the new area.