Kbase 15479: How to Dump and Load a Database the Fastest Way?
Author   Progress Software Corporation - Progress
Access   Public
Published   09/09/2008
Status: Verified

GOAL:

How to Dump and Load a Database the Fastest Way?
How to Perform a Binary Dump and Load?
How to Reduce the Index Scatter Factor?
How to Reduce the Record Scatter Factor?
How to Reduce Database Fragmentation?
Are Binary Dumps OS platform independent?
Are Binary Dumps portable between platforms?

FACT(s) (Environment):

Progress 8.x
Progress 9.x
OpenEdge 10.x

FIX:

A binary dump is the fastest way to dump a database, and binary dump files are portable across all platforms, regardless of CPU type or operating system.

Follow these steps to perform a binary dump:

1) Start a server on the database to be dumped.

Because fewer users are typically connected during a dump and load, more memory than usual can be allocated to the database buffer pool with the -B startup parameter.

For example:

$proserve sports -B 100000


2) Start multiple dump sessions.

Because these sessions are I/O intensive, 3 to 4 sessions per CPU are recommended. Some additional improvement might be obtained by dumping the data to different disks, but this requires more complicated dump and load scripts.

For example:
$proutil sports -C dump customer /disk1/temp/data
$proutil sports -C dump invoice /disk2/temp/data
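
The dump sessions can also be scripted. The following is a minimal sh sketch (the table list file tables.txt and the /disk1 and /disk2 paths are illustrative assumptions, not part of the original article); it alternates the target directory between two disks and runs the dumps four at a time:

#!/bin/sh
# Read one table name per line from tables.txt and dump each table
# in the background, alternating between two target directories.
i=0
while read table
do
    if [ $((i % 2)) -eq 0 ]
    then
        dir=/disk1/temp/data
    else
        dir=/disk2/temp/data
    fi
    proutil sports -C dump "$table" "$dir" &
    i=$((i + 1))
    # Throttle: wait for the current batch after every 4 dumps.
    [ $((i % 4)) -eq 0 ] && wait
done < tables.txt
wait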

3) Load the tables with the smallest records first.

You can use PROUTIL <db> -C DBANALYS to determine the table(s) with the smallest records. This strategy reduces scatter because Progress loads as many records as it can into a given block before it moves on to the next. This uses up the record slots in the block (32 for 1 KB, 2 KB, and 4 KB block sizes, or 64 for 8 KB).
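
For example, the table analysis can be written to a file and reviewed (the output file name here is an assumption):

$ proutil sports -C dbanalys > sports.dba

The RECORD BLOCK SUMMARY section of the output reports the mean record size for each table; load the tables in ascending order of that value.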

If larger records are loaded first, the database might contain blocks with enough free space for smaller records but not for another large record. This leads to fragmentation because the small records end up scattered throughout the database. There should also be only one binary load running per storage area so that the load itself does not cause fragmentation (see the load sketch after the diagrams below).

For example, large records loaded first:

|--------------------|
|    Extra space     |  <-- A couple of small records could fit here, but
|--------------------|      there's not enough room for another large record.
|      Record 3      |
|--------------------|
|      Record 2      |
|--------------------|
|      Record 1      |
|--------------------|

Small records loaded first:

|--------------------|
|    Extra space     |  <-- No room for a large record, so there
|  Rec 15 |  Rec 16  |      will be less scatter.
|  Rec 13 |  Rec 14  |
|  Rec 11 |  Rec 12  |
|  Rec 9  |  Rec 10  |
|  Rec 7  |  Rec 8   |
|  Rec 5  |  Rec 6   |
|  Rec 3  |  Rec 4   |
|  Rec 1  |  Rec 2   |
|--------------------|
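
Continuing the example, a sketch of the corresponding binary loads (this assumes DBANALYS showed customer to have the smallest mean record size, and uses the .bd files produced by the dump step above):

$ proutil sports -C load /disk1/temp/data/customer.bd
$ proutil sports -C load /disk2/temp/data/invoice.bd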

Background processes: use a Before-Image Writer (BIW) and 2 to 4 Asynchronous Page Writers (APWs) during the load. An 8 KB database block size generally performs best.
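
For example, the writers are started as separate processes against the running broker, one proapw invocation per APW:

$ probiw sports
$ proapw sports
$ proapw sports
$ proapw sports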


Fastest way to rebuild the indexes of a Version 8.x database:

With either bulkload or binary load, you must rebuild the indexes after you load the data. If your database is over 2 GB in size, you should use multiple sort files (see the sketch after the command below).

Usually the sort space required for a database is around 75% of the database size, but your requirements might vary.

The index rebuild command is:
proutil <dbname> -C idxbuild all -B 1024 -TB 31 -TM 32
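
To spread the sort over multiple disks, create a <dbname>.srt file in the same directory as the database before running IDXBUILD. As a sketch (the sizes and directory paths below are illustrative assumptions; each line gives a sort file size in 1 KB units followed by its directory, and a size of 0 lets the final file grow as needed), sports.srt might contain:

500000 /disk1/srt/
500000 /disk2/srt/
0 /disk3/srt/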


Index rebuild and Version 9.1:

In Version 9.1, Progress introduced the option to rebuild the index structure during the load phase, as shown below.
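
For example, a sketch using the customer dump file from above:

$ proutil sports -C load /disk1/temp/data/customer.bd build indexes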

In most cases, a binary dump and load is the fastest method of moving data out of and back into a Progress / OpenEdge database. This is especially likely to be true for large volumes of data.

For tables with very few records, it may be quicker to dump the data from the Progress / OpenEdge Data Dictionary or programmatically, because the binary dump has some inherent overhead in writing the header of each binary dump file. On tables with very low record counts, that overhead can cost proportionally more time than the binary dump saves.