Consultor Eletrônico



Kbase 20245: Cyclic Redundancy Check Explained
Autor   Progress Software Corporation - Progress
Acesso   Público
Publicação   30/07/2007
Status: Verified

GOAL:

Cyclic Redundancy Check Explained

GOAL:

How does CRC work within different versions of Progress?

GOAL:

** CRC for <filn> does not match CRC in <procedure>. Try recompiling. (1896)

GOAL:

If dumping and loading database, is a recompilation of programs needed?

GOAL:

Deployment with the Three Database Rule

FACT(s) (Environment):

Progress 7.x
Progress 8.x
Progress 9.x
OpenEdge 10.x

FIX:

Starting with Version 6.3F, Progress incorporated a powerful feature to support the deployment of object code, the Cyclic Redundancy Check (CRC). With CRC, the same object code can run against different database tables that have identical structures, even when these tables are created at different times and with different methods.

Each table in a Progress database has it's own CRC number that is stored in the metaschema field _File._CRC.

Many Progress developers base their entire deployment strategy on the CRC feature. However, after a migration you can get the following error:

1896 ** CRC for <filn> does not match CRC in
<procedure>. Try recompiling. (1896).

A subsequent incremental dump does not reveal any difference between the Development database and the Production database. How can you get this error if the schema are identical?

The answer is in the word "identical". Unfortunately, the CRC calculation for a file contains components that do not show up in an incremental .df file.

Some of these components (for example, _Data-Type, _Extent, and _Unique) are fundamental user-defined properties of the database and certainly affect its "structure".

Others (for example, _order and _Decimals) can be changed at any time and appear to be much less fundamental.

Still others, (for example, _field-rpos) remain completely hidden to the programmer but are important to Progress internally. It is this last category that is responsible for the so-called "CRC issues". The following table lists the different CRC issues by Progress Version.

History of CRC issues in Progress:

Version Number Feature CRC issue
=========......========.................=========
6.3F CRCs introduced * _File._File-number
* _Index._idx-num
*_Field._Field-Rpos
*byte swapping

7 Sequences Introduced * _Field._Field-Rpos
&nbsp.; *Sequence CRC

8.1 Sequence issue resolved * _Field._Field-Rpos

9 Index CRC Introduced * _Field._Field-Rpos
*Index CRC

NOTE: There are also CRCs for schema trigger procedures, but these serve a completely different purpose and are not responsible for any of the problems discussed here.


Discussion of the _Field._field-rpos issue:

There are two reasons why the example of the _field-rpos field of the _Field table is mentioned as being important to Progress internally:

- Every Progress application database contains fields

- Every Progress version is affected by this issue

The following table lists the schema fields that are involved in the CRC calculation for files. (This information is taken from Appendix A of the Progress Version 7.x Programming Handbook):

Fields that take part in the CRC Calculation for Files:

_File _Field _Index _Index-Field
====== ======= ======= =========
_Field-name _Field-Name _Index-Name _Field-Name(1*)
_DB-lang _Data-Type _Unique _Ascending
_dtype _num-comp _Abbreviate
_sys-field _idx-num(2*) _Unsorted
_field-rpos
_Decimals
_Order
. _Extent
_Fld-stdtype
_Fld-stlen
_Fld-stoff
_Fld-case

(1*) The _Field-Name for index fields is referenced in the _Field table using _Field-recid in _Index-Field.

(2*) This component was used prior to Version 7.x.

All of the listed schema fields participate in the CRC calculation for each database table. A change to any of these fields automatically changes the value of _File._CRC for that table (There is also a CRC value for all sequences in the database but more about that later).

As mentioned above, not all of these schema fields can be modified by the developer. The most notorious schema field in this respect is the _field-rpos (Field Record Position) field.

_Field._field-rpos is the field's physical position within a record. This value is automatically supplied by Progress when new fields are created. Freed-up field positions are reused, but it is also possible to create gaps by deleting fields (_field-rpos has nothing to do with the _order field that is for default display order).

The following examples demonstrate some of the _field-Rpos' pitfalls.

Example 1(Prior to Version 9.x):
Create a new Database Sports1 from the Sports Database and make it the working Database. Run procedure rpos.p against it:

/* rpos.p */

output to rpos.txt.
find _file where _file._file-name = "Salesrep" no-lock.

put unformatted ldbname("dictdb") + "." + _File-name + " (CRC " + string(_File._crc) + ")" skip.

for each _field of _file by _field-rpos:
display _field-name format "x(18)" _field-rpos with stream-io.
end.

Sports1.Salesrep (CRC 43047)

Field-Name............._field-rpos
----------------------.-----------
Rep-Name........................2
Region..........................3
Sales-Rep.......................4
Month-Quota.....................5

The output of rpos.p shows that the highest value for _field-rpos is 5 (_field-rpos values always start at 2 because Progress reserves the first field record position to store the table number).

Now add two fields to the Salesrep table: field1 and field2. Leave all other field attributes at their default values. Commit the Database transaction. If you run rpos.p again, you see _field-rpos values 6 and 7 are assigned to the new fields, respectively:

Sports1.Salesrep (CRC 8331)

Field-Name............._field-rpos
----------------------.-----------
Rep-Name........................2
Region..........................3
Sales-Rep.......................4
Month-Quota.....................5
field1..........................6
field2..........................7

Now delete field1. Commit the transaction. If you run rpos.p again, you see that you have created a new gap in the _field-rpos sequence:

Sports1.Salesrep (CRC 36130)

Field-Name............._field-rpos
----------------------.-----------
Rep-Name........................2
Region..........................3.
Sales-Rep.......................4
Month-Quota.....................5
field2..........................7

Now create a second Database (Sports2) from Sports. Create an incremental df file from Sports1 to Sports2.

The resulting delta.df file contains an ADD FIELD instruction for field2 only.

NOTE: The _field-rpos attribute is not mentioned in the delta.df (this is the source of the troubles).

Apply delta.df file to Sports2 and run rpos.p against Sports2:

Sports2.Salesrep (CRC 52448)

Field-Name............._field-rpos
----------------------.-----------
Rep-Name.........................2
Region...........................3
Sales-Rep........................4
Month-Quota......................5
field2...........................6

Both schemas appear to be identical in the Data Dictionary and an incremental dump between Sports1 and Sports2 does not reveal any differences.

The CRCs are different. Sports2.Salesrep.field2 has received a _field-rpos value of 6 where as Sports1.Salesrep.field2 still has _field-rpos value of 7.

Example 2 (This example shows that even _field-rpos numbers in the middle of a table are reused):

Delete Salesrep.Region (_field-rpos = 3) in the Sports1 Database. Commit the transaction and run rpos.p against Sports1.

There is another gap in the _field-rpos sequence:

Sports1.Salesrep (CRC 7618)

Field-Name............._field-rpos
----------------------.-----------
Rep-Name.........................2
Sales-Rep........................4
Month-Quota......................5
field2...........................7

Recreate the Sports2 Database from Sports again. Create an incremental dump file delta2.df from Sports1 to Sports2.

Progress asks if the Region field was renamed or deleted. Make your reply deleted because you do not want to put the data of the old Salesrep.Region field in the new field2.

The delta2.df file now contains an ADD FIELD instruction for field2 and a DROP FIELD instruction for Region. Apply delta2.df to Sports2 and run rpos.p against it:

Sports2.Salesrep (CRC 23241)

Field-Name............._field-rpos
----------------------.-----------
Rep-Name.........................2
field2...........................3
Sales-Rep........................4
Month-Quota......................5

An unexpected outcome! A value of 3 is assigned to field2. The value of 3 has been reused.

The above experiment can only lead to one conclusion:

It is virtually impossible to keep two Database's CRC compatible if you make manual changes in one of them.

Don't make any manual changes.

The three database system before Version 9:

To not make manual changes might sound impractical at first, but in fact manual changes can be accomplished with the three database system.

This system introduces a third database in-between the Development and Production database. This database is referred to as Basic database.

The Basic database does not contain data but has the same CRCs as the Production Database. Make the database changes against the Basic Database using the logical name of the Production Database. Now the .df file can be distributed together with the new objects.

When the same df file is applied to the Production database(s), you are guaranteed to get the same CRC(s) as the Basic database, and the compiled code will run against it.

In summary, follow these steps:

1) Make your manual changes in the Development database.

2) Create an incremental .df file from the Development database to th.e Basic database.

3) Apply the delta.df file to the Basic database.

4) Use the -ld parameter to connect the Basic database with the logical name of the Production database.

5) Compile all affected source code against the Basic database.

6) Deploy the new object together with the delta.df file to the Production database.

7) Apply the delta.df file to the Production database.

8) The object runs against the Production database.

Note that the Development database can be changed freely as developers add new elements. During the development stage, new elements may even be deleted before they ever make it to a field.

Every time a .df is produced from the Development database and applied to the Basic database, it is also applied to all Production database(s), and new compiled code for all affected programs is distributed to all Production sites as
well.

Affected Programs include all those who use any objects whose CRCs have been changed.

Additional information for Progress Versions 8.x and 9.x:

- Versions earlier than 9.x included indices in the CRC calculation.

- Index changed the table CRC.

- Applications that use the modified table do not run without recompiling, even if they did not use the index.

- Version 9.x separated tables and index CRCs.

- Now, only applications that use a changed index must be recompiled.

Impact of index changes on CRC values:

In earlier versions of Progress, indexes were included in the calculation of CRC values for a table. Whenever an index was added or deleted, the CRC value for the table changed. Without recompiling, an application that referenced the table, even if it did not use the index, would not execute.

To rebuild an index after it has been changed required the database to be off-line. The length of time the database was off-line depended on the which approach you used:

The approach that involved the least down time required the developer to make the index inactive:

The state of the index does not affect the CRC value. The index is rebuilt in the background while the database is on line, using PROUTIL.

However, the database must be taken off-line in order to make the index active. While the index is inactive, an application that references that index will not
execute, even if the application has a matching CRC value.

This reduces flexibility in terms of determining when to take the database off-line and activate the indices.

The Version 9.x solution:

In Version 9.x, indices are removed from the table's CRC calculation and given their own CRC values.

Once the applications are converted to Version 9.x and the database is migrated to Version 9.x, a change to the indices does not necessitate a recompilation of
applications because the table's CRC value is unchanged.

You can add temporary and intermittently active indices without changing the CRC and therefore without impact to applications that do not reference the indices.

At run time, the application index CRC values for the indices it references are compared to the index CRCs in the database. If there is a mismatch, the application does not run and an error is issued.

Fast schema update:

Version 8.2A introduced the Fast Schema Update for added fields. Fast Schema Update appends new fields to the end of the record, provided they do not have an initial value. This is done at run time as records are accessed by users, rather than in the off-line batch mode.

Each time a record is accessed, it is scanned to count the number of fields to verify that it has the correct number. If the field count is not correct for the current version of the database, the database manager appends the missing fields to the end. of the record.

Version 9.x introduces two changes to the schema update process:

- Fast Schema Update is extended to deleted fields.

When a record is accessed at run time, a deleted field marker replaces the deleted field.

- Schema versioning reduces the amount of time spent
scanning records..