Kbase P109337: Fathom Replication: source database goes down with errors 6091 827 on ai files
Author   Progress Software Corporation - Progress
Access   Public
Published   6/28/2007
Status: Verified

FACT(s) (Environment):

OpenEdge Replication 10.x
Fathom Replication 3.0A

SYMPTOM(s):

Source replication database goes down with errors 6091 and 827 on after-image file

<function>:Insufficient disk space during <system call>, fd <file descriptor>, len <bytes>, offset <bytes>, file <file-name>. (6091)

** rlaixtn: Insufficient disk space to extend the after-image file. (827)

file-name in error 6091 refers to an after-image file dbname.a1

The target replication database had already shut down earlier

-aistall is not in use on the source database

Failed to switch to next after-image extent. (3784)

CAUSE:

Error 827 only occurs when trying to extend a variable-length after-image extent. The extent cannot be extended because it has reached a file size limit or the disk has run out of space, AND either it is the only ai extent or the next ai extent in the ai sequence is not a free "EMPTY" ai extent. In other words, the remaining ai extents are either "LOCKED" or "FULL" and therefore not available.

This scenario can also occur with fixed-length ai extents, when the current BUSY extent needs to switch to the next ai extent and that extent's status is either FULL or LOCKED.
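
The switch rule described above can be modelled in a few lines. This is a minimal Python sketch for illustration only, not Progress code: the function name can_switch and the list-of-statuses representation are hypothetical, and it assumes the ai extents are used as a circular sequence, so the extent after the last one is the first.

```python
from typing import List


def can_switch(statuses: List[str]) -> bool:
    """Return True if the extent following the BUSY one is EMPTY.

    `statuses` lists the ai extents in sequence order, for example
    ["LOCKED", "BUSY", "EMPTY"]. The sequence is assumed to wrap around.
    """
    normalized = [s.upper() for s in statuses]
    if len(normalized) < 2:
        # A single ai extent has nothing to switch to.
        return False
    busy_index = normalized.index("BUSY")
    next_index = (busy_index + 1) % len(normalized)
    return normalized[next_index] == "EMPTY"
```

With this model, ["LOCKED", "BUSY", "EMPTY"] allows a switch, while ["LOCKED", "BUSY", "LOCKED"] does not, which corresponds to the situation in which errors 827 and 3784 are reported.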

Under the Fathom Replication model, when the replication agent (RPLA) of the target database terminates and/or the target database server goes down, the replication server (RPLS) on the source database also terminates once the connect-timeout has expired. At this stage, the source database is still running and so is after-imaging. The ai extents continue to fill during this time, recording the database activity. Each time the database switches to the next ai extent, the status of the extent just filled changes from "BUSY" to "LOCKED" under the replication model. The "LOCKED" status can only change once the target database (and therefore the RPLA) has been restarted and "dsrutil source -C restart server" has been run against the source database, so that the RPLS can connect to the RPLA and resume applying the ai notes, at block level, from where it last left off.

In other words: whenever a filled ai extent has not yet been applied to the target database, it remains in the "LOCKED" status until it has been applied. Once it has been applied, its status changes to FULL, and it can then be made available again with "RFUTIL source -C aimage empty". This is how the model works. There is no way to change the LOCKED status to anything else while Fathom Replication is enabled. It is therefore imperative to monitor after-image extent availability and to take proactive measures whenever the RPLS and RPLA have lost their connection.
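
The extent life cycle described above can be summarised as a small transition table. The following Python sketch is illustrative only: the event names (switch_in, switch_out, applied_to_target, aimage_empty) are hypothetical labels for the steps in the text, not Progress terminology.

```python
# Status transitions of an ai extent while replication is enabled,
# as described in the text. Event names are illustrative assumptions.
TRANSITIONS = {
    ("EMPTY", "switch_in"): "BUSY",            # extent becomes the current one
    ("BUSY", "switch_out"): "LOCKED",          # filled, not yet applied to target
    ("LOCKED", "applied_to_target"): "FULL",   # ai notes applied to the target
    ("FULL", "aimage_empty"): "EMPTY",         # rfutil source -C aimage empty
}


def next_status(status: str, event: str) -> str:
    """Return the status that follows `status` when `event` happens."""
    try:
        return TRANSITIONS[(status.upper(), event)]
    except KeyError:
        raise ValueError(f"{event!r} is not possible from status {status!r}")
```

Note that there is no entry leading from LOCKED to EMPTY: in this model, as in the text, a LOCKED extent can only be released by being applied to the target first.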

FIX:

There is no need to disable replication on the source database. There are several possible ways to recover from this scenario, depending on the current status of the ai extents. All of them essentially involve making more or new ai extents available so that the source database can continue operations while the target database is recovered and synchronised, and ai notes can continue to be applied, eventually bringing the target in line with the source.

It is worth stopping any ai switch batch/cron jobs during this recovery operation.
The current status of the ai extents can be queried with "RFUTIL source -C aimage list".
Without the -aistall startup parameter, the source database will have shut down as well; that is the starting point for the methods outlined below.
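
When the ai extent status has to be checked repeatedly, the listing can be summarised by a small script. The Python sketch below is an assumption-laden illustration: it assumes the "aimage list" output consists of "Key: value" lines in which each extent begins with an "Extent:" line and carries a "Status:" line. Verify this against the actual output of your release before relying on it. The function names are hypothetical.

```python
from typing import Dict, List


def parse_aimage_list(text: str) -> List[Dict[str, str]]:
    """Parse 'Key: value' lines into one dict per ai extent.

    Assumed layout (not confirmed by this article): every extent
    starts with an 'Extent:' line and has a 'Status:' line.
    """
    extents: List[Dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if ":" not in line:
            continue
        # Split on the first colon only, so values such as timestamps
        # or Windows paths keep their own colons.
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Extent":
            extents.append({})
        if extents:
            extents[-1][key] = value
    return extents


def count_by_status(extents: List[Dict[str, str]]) -> Dict[str, int]:
    """Count extents per upper-cased status, e.g. {'LOCKED': 3, 'BUSY': 1}."""
    counts: Dict[str, int] = {}
    for extent in extents:
        status = extent.get("Status", "UNKNOWN").upper()
        counts[status] = counts.get(status, 0) + 1
    return counts
```

The text passed to parse_aimage_list would be the captured output of the "RFUTIL source -C aimage list" command quoted above.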

a.) /IF/ there are any FULL ai extents:
- manually mark these as empty: "rfutil source -C aimage empty"
- then restart the target and source databases

NOTE: if -aistall had been in place, it would only have been necessary to restart the RPLS with "dsrutil source -C restart server", as the source database would still have been running, with no updates allowed until ai extents became available.

b.) /IF/ there were no FULL ai extents, in other words the filled ai extents are currently marked LOCKED until replication resumes (except of course the current one, which is BUSY), and there were still EMPTY ai extents available, but no disk space available:

- shut the source database down with "proshut source -by" if -aistall is in use; otherwise the source database will already be down.
- move the EMPTY ai extents that had no disk space, together with the current "BUSY" ai extent, to another disk
- run: "prostrct list source source.st"
- edit source.st to reflect the new absolute file location of the moved ai files
- run: "prostrct repair source source.st"
- run: "prostrct list source source.st" again and check the resulting source.st to confirm that the control area of the source database records the new locations of the moved ai files
- start the source database; start the target database.
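
The manual edit of source.st in the steps above can also be scripted. The following Python sketch is a hedged illustration: it assumes after-image lines in the structure file begin with the token "a" followed by the file path (optionally followed by further tokens such as a fixed size) and that paths contain no spaces. The function name and the example paths are hypothetical. The edited file still has to be applied with "prostrct repair source source.st".

```python
from typing import Dict


def relocate_ai_extents(st_text: str, moved: Dict[str, str]) -> str:
    """Rewrite the path on 'a' (after-image) lines of a structure file.

    `moved` maps each old absolute path to its new absolute path.
    Lines for other areas, and ai lines not listed in `moved`,
    are returned unchanged.
    """
    result = []
    for line in st_text.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "a" and parts[1] in moved:
            parts[1] = moved[parts[1]]
            line = " ".join(parts)
        result.append(line)
    return "\n".join(result) + "\n"
```

For example, passing {"/db/source.a2": "/disk2/ai/source.a2"} changes only the line describing source.a2 and leaves every other line of source.st as it was.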

c.) This option is only available in Progress 9.1E or OpenEdge 10.0B (and later). Please refer to Progress Solution P71887, "Unable to switch to new ai extent after adding a new ai extent to the database".
/IF/ there were no FULL ai extents, in other words all ai extents were marked LOCKED (except of course the current one, which is BUSY):
- shut the source database down with "proshut source -by" if -aistall is in use; otherwise the source database will already be down.
- add more ai extents by running: "prostrct add source addai.st" where addai.st defines where the new ai files will be placed. Note, ai extents can be added anywhere there is disk space available.
- run: "prostrct reorder ai source" to ensure that the EMPTY ai extents immediately follow the current BUSY ai extent
- start the source database; start the target database.
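
The choice between methods a, b and c can be summarised as a small decision function. This Python sketch only restates the conditions given above; the function name, its parameters and the "unknown" return value are hypothetical, and whether the EMPTY extents have disk space must be determined separately by the administrator.

```python
from typing import List


def choose_method(statuses: List[str], empty_extents_have_disk_space: bool) -> str:
    """Return 'a', 'b' or 'c' according to the conditions in the text.

    `statuses` holds the status of every ai extent, e.g.
    ["LOCKED", "BUSY", "EMPTY"]. Returns 'unknown' when none of the
    three described situations matches.
    """
    normalized = [s.upper() for s in statuses]
    if "FULL" in normalized:
        return "a"  # mark FULL extents empty, then restart target and source
    if "EMPTY" in normalized and not empty_extents_have_disk_space:
        return "b"  # move EMPTY and BUSY extents to a disk with space
    if all(s in ("LOCKED", "BUSY") for s in normalized):
        return "c"  # add new ai extents and reorder them
    return "unknown"
```

A result of "unknown" (for instance, EMPTY extents that do have disk space) means the situation is not one covered by this article and needs separate investigation.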

After applying the method appropriate to the current scenario, and once the target database is synchronised with the source, the ai notes will be processed against the target database while activity continues on the source database. As soon as each "LOCKED" ai extent has finished being processed, it is marked "FULL"; it becomes available again once it has been marked EMPTY with "rfutil source -C aimage empty". The progress of this activity can be monitored with "DSRUTIL target -C monitor", Option A: Replication Agent. The key factor in this scenario is the availability of ai extents while replication is down and normal processing continues against the source database.