Kbase P117480: How to dump and load a table larger than 2 GB
Author   Progress Software Corporation - Progress
Access   Public
Published   17/04/2009
Status: Verified

GOAL:

How to dump and load a table larger than 2 GB

GOAL:

How to ASCII dump tables > 2 GB

GOAL:

How to ASCII load tables > 2GB

GOAL:

How to handle tables > 2GB in the Data Dictionary

GOAL:

Data Admin fails to load a file greater than 2 GB

GOAL:

Cannot ASCII load a .d file greater than 2GB

GOAL:

Why does a table fail to load with error: Array subscript 11 out of range (26)?

FACT(s) (Environment):

All Supported Operating Systems
Progress 7.x
Progress 8.x
Progress 9.x
OpenEdge 10.0x
OpenEdge 10.1A
OpenEdge 10.1B

FIX:

The Data Dictionary and Data Administration dump and load utilities are essentially 4GL utilities. So, on a file system without 2 GB limits or user limits/quotas, the 4GL is able to create a dump file larger than 2 GB: the output file is simply appended to, leaving the environment to control the file size limit.

When the ASCII dump succeeds in creating a dump file greater than 2 GB, the product limitation prior to OpenEdge 10.1C is that the trailer information (even if manually entered) cannot be read while loading that dump file, because the lseek position is still 32-bit limited. The load therefore fails, typically with the error messages:
** Array subscript 11 out of range (26)
** Unable to update Field (142)

There are times when it is necessary to ASCII dump tables, particularly when corruption is suspected. When these tables exceed 2 GB in size, the product limitations prior to OpenEdge 10.1C prevent the dump and/or the load of files larger than 2 GB, depending on the Progress version and operating system.

A workaround is therefore to dump each table exceeding 2 GB into multiple ASCII dump files and then bulkload and idxbuild them as follows:

Note: the following code will either have to be pre-compiled in order to run against a run-time license on the client site, or a temporary compile-time license will have to be put in place for the duration of the exercise.
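For illustration, on a machine that has a development (compile-time) license and is connected to a database with the same schema, the procedure below could be pre-compiled to r-code with a statement such as:

COMPILE asciidump_table.p SAVE.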

0.) Take a valid backup of the database and truncate the BI file.
1.) On the source database, run a tabanalys to find which tables are 2 GB or larger and their respective record sizes.
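For example, along these lines (the database name and output file are illustrative):

$ proutil sourcedb -C tabanalys > sourcedb.tab

The record count and mean record size reported for each table indicate roughly how many records will fit into a 2 GB dump file.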
2.) Use the following code to dump the tables in question:

/* <asciidump_table.p> */
/* Replace 'customer' with the table name in question.
   The MODULO value is the number of records per dump file. In this example a
   conservative worst case has been assumed: 2 GB / 32 KB per record = roughly 65,000 records.
   The trailer needs to be changed to the database's requirements for 'numformat',
   'dateformat' and 'cpstream'.
   The resulting dump files will be named <tableName><n>.d, in MODULO increments.
*/

&GLOBAL-DEFINE tableName customer
DEFINE VARIABLE i AS INTEGER NO-UNDO. /* records written to the current dump file */
DEFINE VARIABLE j AS INTEGER NO-UNDO. /* current dump file number */
DEFINE STREAM data1.                  /* the ASCII dump data */
DEFINE STREAM data2.                  /* progress marker holding the last ROWID exported */
ASSIGN j = 1.
PAUSE 0 BEFORE-HIDE.
OUTPUT STREAM data1 TO VALUE("{&tableName}" + STRING(j) + ".d").
OUTPUT STREAM data2 TO "{&tableName}.txt" UNBUFFERED.
/* FOR EACH {&tableName} BY ROWID({&tableName}): */ /* use only if index corruption is suspected */
FOR EACH {&tableName}: /* this dumps by the primary index; specify BY <indexname> if another is required */
    EXPORT STREAM data1 {&tableName}.

    /* record the ROWID of the last record dumped, overwriting the start of the .txt file */
    SEEK STREAM data2 TO 0.
    EXPORT STREAM data2 STRING(ROWID({&tableName})) FORMAT "x(20)".
    i = i + 1.

    IF i MODULO 65000 = 0 THEN DO:
        j = j + 1.
        i = 0.

        /* write the PSC trailer, close the current dump file and start the next one */
        PUT STREAM data1 UNFORMATTED SKIP
            '.' SKIP
            'PSC' SKIP
            'numformat=44,46' SKIP
            'dateformat=dmy-1950' SKIP
            'cpstream=ISO8859-1' SKIP
            '.' SKIP.

        OUTPUT STREAM data1 CLOSE.
        OUTPUT STREAM data1 TO VALUE("{&tableName}" + STRING(j) + ".d").
    END.
END.
/* This is only needed for the last dump file, which does not meet the MODULO condition */
PUT STREAM data1 UNFORMATTED SKIP
    '.' SKIP
    'PSC' SKIP
    'numformat=44,46' SKIP
    'dateformat=dmy-1950' SKIP
    'cpstream=ISO8859-1' SKIP
    '.' SKIP.

OUTPUT STREAM data1 CLOSE.
OUTPUT STREAM data2 CLOSE.
/* </asciidump_table.p> */
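The procedure can then be run against the source database, for example in single-user batch mode as sketched below (the client executable, database name and log file are illustrative and vary per installation):

$ _progres sourcedb -1 -b -p asciidump_table.p > asciidump_table.log 2>&1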

3a.) To load into the target database, either load each <tableName><n>.d one at a time through the Data Dictionary, which builds the indexes as each record is loaded, or

3b.) Against the source database, create a bulk load descriptor file for the tables concerned (dbname.fd),
then edit the descriptor file to load each <tableName><n>.d and run the bulkload as follows:
$ proutil dbname -C bulkload dbname.fd

After running the bulkload, run an index build on these tables:
$ proutil dbname -C idxbuild [+ parameters ]
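As an illustration (the table qualifier and the -TB/-TM/-B sort and buffer values below are suggestions only; adjust them to the system and Progress version in use):

$ proutil dbname -C idxbuild table Customer -TB 24 -TM 32 -B 1000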

For example: dbname.fd

Customer customer1.d customer1.e
CustNum
Country
Name
Address
Address2
City
State
PostalCode
Contact
Phone
SalesRep
CreditLimit
Balance
Terms
Discount
Comments
Fax
EmailAddress
.
Customer customer<n>.d customer<n>.e
CustNum
Country
Name
Address
Address2
City
State
PostalCode
Contact
Phone
SalesRep
CreditLimit
Balance
Terms
Discount
Comments
Fax
EmailAddress
.
# end-of-file.