Kbase P65028: Time to apply AI file takes longer and longer over time
Author: Progress Software Corporation - Progress
Access: Public
Publication Date: 08/05/2009
Status: Verified
SYMPTOM(s):
The BI file of the hot spare (standby) database is abnormally large compared to the production database
Abnormal BI growth on roll forward
Growing Before Image (BI) file on standby database
rfutil -C roll forward -a <ai_name> NOT being run in batches of ai files
Taking a long time to apply AI file on hotspare database
Hotspare database falling behind the live database because rollforward takes longer to complete than the ai switch time
AI files switched at 10 minute intervals
AI files take longer than 10 minutes to roll forward
BI file on the hotspare database grows on each roll-forward
Bottleneck is in the time that it takes to complete the Physical Redo Phase (7161)
Resource contention on the server hosting the hotspare database has been ruled out
Hotspare database roll-forward log file does not show roll forward retry attempts
ai blocksize matches bi blocksize
bi cluster size the same on live and hotspare databases
Database startup parameters -G 60 -Mf 3.
FACT(s) (Environment):
Progress 9.1x
All Supported Operating Systems
OpenEdge 10.x
Progress 9.x
Progress 8.x
CAUSE:
Bug# OE00112670
CAUSE:
Bug# OE00148288
CAUSE:
The behavior is a function of the design of the redo of the BKMAKE note. When redo starts, it normally only needs to go back one BI cluster. If BKMAKE notes are encountered, however, they are redone regardless of their update counter value; any subsequent block changes are also redone, and the timestamp on the BI cluster is reset to reflect the action in case of a crash. BKMAKE notes are the result of a database extension or a movement of the high-water mark. As each AI file is applied after this point, redo restarts in that cluster and takes progressively longer until the cluster is eventually marked as closed. Roll forward is more prone to this issue than replication, depending on which AI files are applied and when they eventually cause the re-opened clusters to close; a replication target does not need to be shut down and therefore does not repeatedly go through redo.
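The symptom list above notes that rfutil -C roll forward was being invoked one AI file at a time as each file arrived, rather than in batches. A minimal shell sketch of applying a batch of accumulated AI files in sequence against the hotspare database follows; the database name (hotspare) and the archived AI file paths are assumptions, not values from this article:

    # Apply the accumulated AI files in the order they were generated on
    # the live database; stop at the first failure so no file is skipped.
    for f in /ai_incoming/live.a0001 /ai_incoming/live.a0002 /ai_incoming/live.a0003
    do
        rfutil hotspare -C roll forward -a "$f" || break
    done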
FIX:
Upgrade to OpenEdge 10.1C, where code changes were implemented to improve the REDO processing.
Whether or not upgrading is possible, re-baseline the hotspare database as follows (a consolidated command sketch appears after these steps):
1.) Take an online probkup of the live database (live.bak). This will cause an AI switch, so be sure that there is an empty AI extent available to switch into.
2.) Note the following entry in the live database's .lg file:
"Switched to ai extent 99999. (3777)"
3.) prostrct create hotspare <structure_file.st> -blocksize <same_as_live_db>, using the same extent structure (.st file) as the live database
4.) prorest hotspare live.bak
5.) Begin the AI roll forward, starting with the AI extent identified by the log entry recorded in step 2
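A consolidated command sketch of steps 1 through 5, assuming the live database is named live, the rebuilt copy is named hotspare, and the backup and AI archive paths are hypothetical:

    # 1. Online backup of the live database (this forces an AI extent switch).
    probkup online live /backups/live.bak

    # 2. In live.lg, find the "Switched to ai extent ... (3777)" message written
    #    at backup time and record which extent was switched into.

    # 3. Create the empty hotspare structure with the same block size and the
    #    same extent layout as the live database (hotspare.st and the 8 KB
    #    block size are examples only).
    prostrct create hotspare hotspare.st -blocksize 8192

    # 4. Restore the online backup into the new structure.
    prorest hotspare /backups/live.bak

    # 5. Roll forward, starting with the AI extent recorded in step 2.
    rfutil hotspare -C roll forward -a /ai_archive/live.a<nn>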
Consider using FIXED AI extents and ensure that none of the database extents are growing into the variable extent (i.e., the last extent).
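If fixed AI extents are adopted, they are defined in a structure description (.st) file and added with prostrct add while the database is offline. A minimal sketch, in which the paths, the database name live, and the 512000 KB (roughly 500 MB) extent size are assumptions rather than recommendations:

    # add_ai.st - three fixed-size AI extents plus one trailing variable extent
    a /ai/live.a1 f 512000
    a /ai/live.a2 f 512000
    a /ai/live.a3 f 512000
    a /ai/live.a4

    # Add the extents to the (offline) live database:
    prostrct add live add_ai.st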