Kbase P112594: Clock backwards (8896) breaks Fathom Replication source (861) or (766)
Author: Progress Software Corporation - Progress
Access: Public
Published: 2/4/2010
Status: Verified
SYMPTOM(s):
Clock backwards (8896) breaks source db (861) or (766)
source and target databases on different servers
Changing system time backwards on source server sometimes causes source database to not restart
Cluster aging has detected that the clock has gone backwards by seconds. (8896)
After message 8896 source database continues running
After message 8896 RPLS and RPLA continue processing
source database is shut down with open transaction(s) still running after time switch
on restart of source database, during roll back recovery either error 861 or 766
SYSTEM ERROR: rlmemchk mb_aictr: note=<number> mstrblk=<number>. (861)
SYSTEM ERROR: Unexpected extent switch note (766)
source database is not recoverable after backwards time change
Not reproducible on a non-replication-enabled database
system time change forwards does not cause failure
Stack trace from KERNEL32.dll reads
dsmFatalMsgnCallBack
rlairedo
dorollf
rlrollf
warmstrt
FACT(s) (Environment):
Fathom Replication
Windows
All Supported Operating Systems
OpenEdge Replication
CAUSE:
Bug# OE00125896
FIX:
If a time synchronization service such as Windows Time (w32time) is running, the system time change may have occurred automatically, in which case the transition steps outlined further below are the only workaround. If a difference in system time between the source and target database servers is noticed, the best course of action is as follows:
1.) if possible, ask users on the source database to complete their work then shut the source and target databases down gracefully:
$ proshut [source | target] -by
2.) change the system time back
3.) restart the source and target databases
$ proserve [source | target] [.. startup parameters .. ] -DBService [replserv | replagent]
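The three steps above can be sketched as a pair of shell functions. This is a minimal sketch and not a supported procedure: the function names `run`, `shutdown_both` and `restart_both`, the `DRY_RUN` variable, and the database names passed in are assumptions made for illustration. Only the `proshut ... -by` and `proserve ... -DBService` invocations come from the steps above. The order of the commands simply follows the wording of the steps, and any other startup parameters the site needs are left out. With `DRY_RUN=1` (the default here) the commands are only printed, so the sequence can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch only: prints (or runs) the shutdown and restart commands from
# steps 1 and 3. DRY_RUN, run, shutdown_both and restart_both are
# hypothetical names, not part of any Progress utility.

DRY_RUN="${DRY_RUN:-1}"

# Print the command; execute it only when DRY_RUN is 0.
run() {
    printf '%s\n' "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Step 1: shut the source and target databases down gracefully.
shutdown_both() {
    run proshut "$1" -by || return 1
    run proshut "$2" -by || return 1
}

# Step 3: restart both databases with their replication services.
# Add the site's other startup parameters where needed.
restart_both() {
    run proserve "$1" -DBService replserv || return 1
    run proserve "$2" -DBService replagent || return 1
}
```

Step 2, changing the system time back, is deliberately left as a manual action between the two calls: run `shutdown_both source target`, correct the clock, then run `restart_both source target`.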
If the system time change has occurred while the databases were online:
One option would be to force into the source database, break replication, dump and load the source, and rebaseline the replication target. This is strongly inadvisable. The target database is still unaffected and can therefore be transitioned instead. The following outlines this generic procedure. Please refer to the Fathom Replication User's Guide for specific information, particularly for OpenEdge versions where this utility has been further enhanced.
Steps:
This procedure presumes that the target database is still running but the Replication Server (RPLS) has stopped.
1.) The Agent needs to be in Pre-Transition state, so verify the status of the RPLA, either with:
$ dsrutil target -C monitor
A. Replication agent status
" State: Pre Transition "
or parse the target.lg file for message:
RPLA 5: A TCP/IP failure has occurred. The Agent's will enter PRE-TRANSITION, waiting for connection from the Replication Server. (11699)
if the agent is NOT in Pre-Transition state, force this state as follows:
$ dsrutil target -C triggertransition agent
NOTE: The target database needs to still be online. You cannot trigger transition if the Agent is still connected to the replication server.
2.) Depending on how far behind the target was from the source, you may want to roll forward ai files not already applied to the target database.
[source]: get the current state of each source.an file
$ rfutil source -C aimage list
only ai files with a Status of LOCKED or BUSY are relevant
[target]: find the last ai file that was being applied
$ dsrutil target -C RECOVERY agent > recagent.out
The following information from 'recagent.out' is relevant to this Example:
Replication local agent information:
Last Block: Incomplete
ID of the last TX begin: 1634
ID of the last TX end: 1635
Time of last TX end: Thu Jan 12 13:16:08 2006
After Image File Number: 6
Completly Applied to Target: No
[target]: roll forward ai notes before transitioning the target database
$ dsrutil target -C ApplyExtent source.a6
the target.lg file will show messages similar to the following upon successful completion:
RPLA 5: Application of Source database AI Extent source.a6 has begun.
RPLA 5: Retry transaction point located at dbkey 0 note type 13 updctr 0. (6806)
RPLA 5: Retry point located at dbkey 662272 note type 25 updctr 6. (6807)
RPLA 5: Source database AI Extent source.a6 has been applied to this database.
3.) Transition the target database. Roll back recovery will be performed and then the target database will be shut down.
$ dsrutil target -C transition agent
4.) The target database is now a non-replication-enabled database (i.e. a "normal" database) which can be accessed and used to re-baseline replication.
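Two of the checks in the procedure above lend themselves to scripting: confirming from target.lg that the Agent reported entering Pre-Transition (message 11699), and reading the After Image file number from recagent.out to work out which extent to pass to ApplyExtent. The following is a minimal sketch under stated assumptions. The function names `agent_in_pretransition` and `ai_extent_from_recovery` are hypothetical. It assumes that the log carries the message number in parentheses as in the excerpt above, that recagent.out contains an "After Image File Number" line as shown, and that extents are named `<dbname>.a<number>` as in the source.a6 example.

```shell
#!/bin/sh
# Sketch only: helper checks for the transition procedure.
# Both function names are hypothetical, not Progress utilities.

# Succeeds if the given log file contains message (11699), which
# reports that the Agent will enter PRE-TRANSITION.
agent_in_pretransition() {
    grep -F -q '(11699)' "$1"
}

# Prints an AI extent name built from the first
# "After Image File Number" line of the RECOVERY agent output.
# Assumes extents are named <dbname>.a<number>.
ai_extent_from_recovery() {
    dbname="$1"
    recfile="$2"
    num=$(sed -n 's/^[[:space:]]*After Image File Number:[[:space:]]*\([0-9][0-9]*\).*$/\1/p' "$recfile" | head -n 1)
    # Fail without printing anything if the line was not found.
    [ -n "$num" ] || return 1
    printf '%s.a%s\n' "$dbname" "$num"
}
```

For example, `agent_in_pretransition target.lg` could gate the triggertransition command, and `ai_extent_from_recovery source recagent.out` could supply the extent name for ApplyExtent. The log check only tests for the presence of the message, not how recent it is, so an old 11699 entry from an earlier incident would also match. Confirm the current state with the monitor before transitioning.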