Kbase 5527: Explanation of the Time Stamp Changes for Version 6.2F and Above
Author: Progress Software Corporation - Progress
Access: Public
Published: 12/22/2006
Status: Unverified
GOAL:
Explanation of the time stamp / r-code changes that are scheduled to be implemented in version 6.2F.
FIX:
Today (1990) we keep a timestamp in a database schema for each file in the schema. When the file is first created, and whenever we add or delete any index or field in the file, the timestamp is updated. Whenever a program is compiled against a particular database which refers to that file, the timestamp from that file in that database is inserted in the ".r" or object code. When the ".r" is run, the timestamp for each file it refers to must match the schema timestamp for that file in the database being run against. This can cause endless aggravation for users who need to run ".r" object code against many databases having the same structure but which cannot have the same time stamp because they are created and updated at different times. Deployment of updates to an application is more complicated, and when you have multiple databases on a single machine, each must have its own set of ".r"s.
To solve this problem, we will first change the .r structures to
eliminate any information that is unlikely to be identical in two
similar databases (field RECIDs). We will then replace the timestamp
mechanism with one that more directly represents the structurally
important aspects of the schema.
Instead of a timestamp, we will use a Cyclic Redundancy Check (CRC) of
the relevant attributes in the schema to represent the file's current
structure. Whenever the schema is changed, a new CRC for the file will be calculated. If a procedure refers to a file, its CRC will be stored. With this change, the following process can be used to deploy an original application and subsequent updates.
Initial Release of a New Application: (same as now)
1) Develop the application on some database, creating new schema and
".p"s.
2) Use the Progress Data Dictionary to dump a ".df" file from the
development database.
3) Get an empty Progress database and use the Data Dictionary to
load the ".df" into it to create a "basic-db".
4) Save the "basic-db" for use in future releases.
5) Compile all programs (".p"s) on the "basic-db".
6) Ship the "basic-db" with the ".r"s to end-users.
New Release of an Existing Application:
1) Develop the new ".p"s on a copy of the "basic-db".
2) Use the Data Dictionary to produce a delta ".df" which has just the
changes needed to create the new version from the "basic-db".
3) Get a copy of the old "basic-db" and use the Data Dictionary to
load the delta ".df" into it to create a "new-basic-db".
4) Save the "new-basic-db" for use in future releases.
5) Compile any ".p"s referring to altered schema.
6) Ship new ".r"s and the delta ".df" to end-users.
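Step 5 of the update procedure can be sketched as follows. This is only an illustration: the per-file CRC values, the mapping from each ".p" to the files it references, and all function names are hypothetical and are not part of PROGRESS.

```python
def files_with_changed_crc(old_crcs, new_crcs):
    """Return the names of files whose CRC differs, or that were added or removed."""
    names = set(old_crcs) | set(new_crcs)
    return {name for name in names if old_crcs.get(name) != new_crcs.get(name)}


def procedures_to_recompile(references, changed_files):
    """Return the procedures that reference at least one changed file."""
    return sorted(
        proc for proc, files in references.items() if changed_files & set(files)
    )


# Hypothetical CRCs taken from the old "basic-db" and the "new-basic-db".
old_crcs = {"customer": 0x1A2B, "order": 0x3C4D, "item": 0x5E6F}
new_crcs = {"customer": 0x1A2B, "order": 0x9999, "item": 0x5E6F}

# Hypothetical cross-reference of procedures to the files they use.
references = {
    "custrpt.p": ["customer"],
    "ordent.p": ["customer", "order"],
    "itemlst.p": ["item"],
}

changed = files_with_changed_crc(old_crcs, new_crcs)
to_recompile = procedures_to_recompile(references, changed)
print(to_recompile)
```

Only the procedures returned here would need new ".r"s in the shipment; the rest continue to match the end-user databases after the delta ".df" is loaded.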
------------------------------------------------------------------
2. Important Issues
2.1 Application Demonstration Copies
When we changed our Test Drive (demo) strategy to "n runs" in a
database, we took care to ensure that it would allow VARs to ship demo
copies of their applications while protecting their proprietary R-code
from someone who buys a full copy of Progress. We made sure that full
PROGRESS would not operate on a Test Drive database, and since the
R-code shipped by the VAR was tied through the timestamp to the demo
database, no one could get a full database and be able to run the
R-code against it. With the CRC approach, the full and demo databases
built by the VAR will look identical, so demo R-code will work with
the full database.
To continue to support this method of delivering low cost evaluation
copies, we will include in the CRC a different seed depending on
whether the PROGRESS module is full or Test Drive. The only exception
will be that the metaschema files and fast-track files will have the
same CRC whether or not they are in a Test Drive database.
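The effect of a module-dependent seed can be sketched as follows. This is a minimal illustration, not the PROGRESS algorithm: the seed values, the function name, and the use of Python's zlib.crc32 with a starting value are all assumptions.

```python
import zlib

FULL_SEED = 0x00000000        # hypothetical seed for full PROGRESS
TEST_DRIVE_SEED = 0x54455354  # hypothetical seed for Test Drive


def seeded_crc(structure, test_drive, shared=False):
    """CRC of a file's structural description.

    Files marked as shared (metaschema and fast-track files) always use the
    full seed, so their CRC is the same in both kinds of database.
    """
    seed = TEST_DRIVE_SEED if (test_drive and not shared) else FULL_SEED
    return zlib.crc32(structure.encode("ascii"), seed)


app_file = "customer|cust-num|integer"
full_crc = seeded_crc(app_file, test_drive=False)
demo_crc = seeded_crc(app_file, test_drive=True)

meta_full = seeded_crc("_File|_File-name|character", test_drive=False, shared=True)
meta_demo = seeded_crc("_File|_File-name|character", test_drive=True, shared=True)
```

With identical schema, the application file gets different CRCs in the full and Test Drive databases, so demo R-code does not match a full database, while the shared files compare equal.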
2.2 Security
The security fields in the schema (_Can-xxx) do not require a change
to the timestamp when they are modified. This is necessary to let a
VAR package and deploy R-code before the end user site has
established its security constraints. Insofar as the VAR wants to
build security into the system, each program can do whatever
run-time checks are deemed necessary. The _Can fields prevent an
end-user from writing and compiling new programs that might destroy
the integrity of the system or violate sensitive information.
With the CRC scheme, however, a user could construct a counterfeit
database without security constraints, compile a program that does
anything, and run it against the production database. With the
timestamp scheme, you could never get the timestamps in the
counterfeit database to match, so the precompiled program would
not run against the production database.
Assuming that users care about this potential security breach,
we will allow the _Db-revision field for the anonymous _Db record
(i.e., the _Db record for the PROGRESS local database itself, as
opposed to gateway _Db records) to be used as a way to associate
.r's with a given database and prevent counterfeit .r's from being
run on the database. A proutil option will be added to let a user
set the _Db-revision field by supplying a key which will be one-way
encrypted. This _Db field is already being stored in .r's and
being checked against the corresponding _Db records when .r's are
run. Once a _Db-revision key code has been assigned to a database,
you cannot change it unless you know the current key. Also, we will
protect the _Db-revision field for the local database by not
allowing any update of it through PROGRESS itself, but only through
proutil by way of the key. Another proutil option will be added to
let a user set the same key into existing .r's.
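The run-time check described above can be sketched as follows. This is a simplified model under stated assumptions: the Database and RCode classes, their attribute names, and the sample values are hypothetical and do not reflect the actual r-code layout.

```python
from dataclasses import dataclass


@dataclass
class Database:
    db_revision: int   # stored (hashed) _Db-revision key
    file_crcs: dict    # file name -> current CRC


@dataclass
class RCode:
    db_revision: int   # key value stored in the .r
    file_crcs: dict    # file name -> CRC recorded at compile time


def can_run(rcode, db):
    """Allow execution only if the key and every referenced file CRC match."""
    if rcode.db_revision != db.db_revision:
        return False
    return all(
        db.file_crcs.get(name) == crc for name, crc in rcode.file_crcs.items()
    )


production = Database(db_revision=0xCAFE, file_crcs={"customer": 0x1111})

genuine = RCode(db_revision=0xCAFE, file_crcs={"customer": 0x1111})
# Compiled against a counterfeit database with the same structure but no key.
counterfeit = RCode(db_revision=0, file_crcs={"customer": 0x1111})
# Compiled before the customer file's structure changed.
stale = RCode(db_revision=0xCAFE, file_crcs={"customer": 0x2222})

results = [can_run(r, production) for r in (genuine, counterfeit, stale)]
```

The counterfeit .r has matching CRCs but is rejected because its key does not match the production database.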
2.3 R-code Size
The r-code will grow slightly for any cases where there is a
PROMPT-FOR that references fields in a file which has not been
referenced in any other way anywhere in the program. (The name of
that file will have to be added to the r-code to allow it to be found
at runtime, since the file itself will not be scoped and there will be
no other mechanism to identify it.)
2.4 Foreign Databases
Foreign dbims need to check their own fields? That is, we should not
be checking any xxxMISCn fields. Also, we will not check any reserved
fields--xxxRESn.
2.5 Components of the CRC Calculation
FOR FILE: _File
  IT USES:     _File-name _DB-lang
  AND IGNORES: _CANXX xxMISC xxRES _Prime-Index _File-Number _dft-pk
               _numkey _numkcomp _numkfld _Template _numfld _Desc
               _DB-recid _Valexp _Valmsg _Last-change _Hidden _Frozen
               _Dump-name _CRC
FOR FILE: _Field
  IT USES:     _Field-Name _Data-Type _sys-field _field-rpos _Decimals
               _Order _Fld-stdtype _Fld-stlen _Fld-stoff _Fld-case
  AND IGNORES: _CANxx xxMISC xxRES _File-recid _Initial _Label
               _Mandatory _Format _Valmsg _Help _Desc _Col-label
FOR FILE: _Index
  IT USES:     _Index-Name _Unique _num-comp _idx-num
  AND IGNORES: xxMISC xxRES _File-recid _Active
FOR FILE: _Index-Field
  IT USES:     _Field-Name _Ascending _Abbreviate _Unsorted
  AND IGNORES: xxMISC xxRES _Index-Seq _Index-recid _Field-recid
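A minimal sketch of how a per-file CRC could be built from the "uses" attributes while skipping the ignored ones. The serialization format, the function name, the sample field records, and the choice of zlib.crc32 are assumptions made for illustration; only a subset of the _Field attributes is included.

```python
import zlib

# Subset of the _Field attributes listed above as contributing to the CRC.
FIELD_ATTRS_USED = ("_Field-Name", "_Data-Type", "_Decimals", "_Order")


def file_crc(file_name, fields):
    """CRC over the file name and the structural attributes of each field.

    Attributes not named in FIELD_ATTRS_USED (for example _Label) are
    never read, so changing them cannot change the result.
    """
    crc = zlib.crc32(file_name.encode("ascii"))
    for fld in sorted(fields, key=lambda f: f["_Order"]):
        for attr in FIELD_ATTRS_USED:
            token = f"{attr}={fld[attr]}\0"
            crc = zlib.crc32(token.encode("ascii"), crc)
    return crc


base = [
    {"_Field-Name": "cust-num", "_Data-Type": "integer",
     "_Decimals": 0, "_Order": 10, "_Label": "Cust #"},
    {"_Field-Name": "balance", "_Data-Type": "decimal",
     "_Decimals": 2, "_Order": 20, "_Label": "Balance"},
]

# Same structure, different label: an ignored attribute.
relabeled = [dict(f) for f in base]
relabeled[1]["_Label"] = "Amount Due"

# Different number of decimals: a structural attribute.
rescaled = [dict(f) for f in base]
rescaled[1]["_Decimals"] = 4

label_change_keeps_crc = file_crc("customer", base) == file_crc("customer", relabeled)
decimals_change_alters_crc = file_crc("customer", base) != file_crc("customer", rescaled)
```

Two databases loaded from the same ".df" at different times would produce the same value under this scheme, which is the property the timestamp could not provide.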
2.6 Philosophy of R-code/Schema Comparison Mechanism
The R-code/Schema comparison mechanism is intended to prevent the
execution of out of date R-code. The fundamental question that
needs to be defined is what "out of date" means. There are (at least)
3 possible definitions or goals of this mechanism.
1. Minimal System Integrity
This approach only guarantees that the .r will not crash
or corrupt a Database attempting to execute a .r. This is the
unstated philosophy underlying our current timestamp mechanism.
Only schema changes that could result in serious internal
inconsistencies force the timestamp to change.
2. Can Recompile
This approach attempts to invalidate a .r unless the original .p
could be recompiled with no problem. For example, in version 5,
changing a field's name does not change the timestamp. But an
unchanged .p that refers to the old field name is unlikely
to recompile successfully.
3. Guaranteed Functionality
This approach attempts to guarantee that the original .p, if
recompiled, would function identically to the existing .r. For
example, changing the validation expression or message does not
invalidate the timestamp today. So by running an old .r, a user
may be putting invalid data into the database.
Though our current scheme can be argued to provide more flexibility
by letting the user run more r-code after a schema change without
recompiling, it may be more confusing when some later recompilation
fails or when data is discovered that fails to match the current
validation constraints.
3. Compatibility Considerations
This would require a change to the r-code version. The old version 6.2
code would be supported by the new version, but not vice-versa. The new version of PROGRESS would use field dbkeys in the r-code if the version number were old, but would interpret the dbkey as file/field
identifiers if the version number were new. Renaming a field, file or
index may cause ".r"s to fail, where they would not have in the past.
Of concern is the danger that a VAR ships new R-code to users still
running 6.2A. Even within a single organization, if an application
started with 6.2A, how will we transition to 6.2F? To aid with this
transition we will supply a command line option (-crc) which will
enable version 6.2F+ to produce new R-code. Version 6.2F+ will
continue to support the old style R-code on either new or old database
types. It will create old R-code unless the -crc option is given.
If -crc is given and the CRC fields are not initialized, the user
must run a utility that initializes the CRC mechanism for the specific
database (see the PROUTIL commands in the next section). We already
have an _CRC field in the _File file with a value of unknown. The
utility will calculate the CRC for each file. Also the empty database
will have the CRC values starting with version 6.2F, and the conv56
and convft utilities will also calculate these values.
4. Performance Impact - Probably not much.
5. Memory Impact - None? -- perhaps the new code will be slightly
larger.
Online Procedures or Utilities:
Proutil will be changed, adding three new command line options. The
commands are used to update or initialize databases and R-code files
to use the _DB-revision (a.k.a. datestamp authorization key) field of
the database and the database file CRCs to validate R-code running
against a database. The authorization keys will be entered by users
as an arbitrary string of characters on the command line. They will
be hashed and stored internally as 32-bit integers. There will be
no way to extract the original (printable) key value once it has been
set.
The new proutil commands are:
dbauthkey
Usage: proutil databasename -C dbauthkey oldkey newkey
This option sets the _DB-revision field for the named database
based on the newkey argument. If the database has not been
initialized with an authorization key, then the oldkey is
ignored. Otherwise, the oldkey is compared with the key already
in the database, and the two must match, otherwise proutil
aborts. If the keys do match, then the authorization key for the
database is set to newkey.
dbinitcrc
Usage: proutil databasename -C dbinitcrc
This causes the internal CRCs for the files in the schema for
the named database to be computed and stored in the database
file records.
rcodekey
Usage: proutil databasename -C rcodekey oldkey newkey files...
This works in a manner similar to the dbauthkey command, except
that it scans the named files (which have to be R-code) and
updates the authorization key for the named database in the
R-code with the newkey. Like the dbauthkey command, the oldkey
must match any existing key in the R-code before the newkey is
applied.
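The old-key/new-key rule shared by dbauthkey and rcodekey can be sketched as follows. The hash used by proutil is not specified in this article, so a truncated SHA-256 digest stands in for it here; the function names and key strings are likewise hypothetical.

```python
import hashlib


def hash_key(key):
    """Reduce a printable key to a 32-bit integer (stand-in hash, an assumption)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def replace_key(stored, oldkey, newkey):
    """Return the new stored value, or raise if oldkey does not match.

    stored is the current 32-bit hash, or None if no key has been set,
    in which case oldkey is ignored.
    """
    if stored is not None and hash_key(oldkey) != stored:
        raise PermissionError("old key does not match the stored key")
    return hash_key(newkey)


# First assignment: no key is set yet, so the old key is ignored.
stored = replace_key(None, "ignored", "first-key")

# A later change succeeds only when the current key is supplied.
stored = replace_key(stored, "first-key", "second-key")
```

Only the hashed value is kept, so the printable key cannot be read back from the stored integer, consistent with the behavior described above.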