Kbase P122873: All after imaging (AI) areas are in LOCKED state after a network failure
Author   Progress Software Corporation - Progress
Access   Public
Published   6/1/2010
Status: Verified

SYMPTOM(s):

All after-imaging (AI) areas are in LOCKED state after a network failure

Backup fails because it can't switch to an empty AI area

Source database log file shows errors at the time of the network failure:

Connection failure for host <host> port <port> transport TCP. (9407)

A communications error -4008 in rpCOM_RecvMsg. (11713)

A communications error -157 occurred in function rpNLS_SendAIBlockToAgent while sending AIBLOCK. (10491)

The Fathom Replication Server is beginning recovery for agent agent1. (10661)

Connecting to Fathom Replication Agent agent1. (10842)

The Fathom Replication Agent agent1 cannot be contacted by the database broker on host <host>, port -1. (10496)

The connection attempt to the Fathom Replication Agent agent1 failed. (10397)

The Fathom Replication Server was unable to reconnect to agent agent1. Recovery for this agent will not be performed. (10697)

The Fathom Replication Server will shutdown but the source database will remain active. (10698)

The Fathom Replication Server is ending. (10505)

There are no available EMPTY AI extents. Database activity is stalled until an AI extent becomes available. (12288)

Can't switch to after-image extent <extent> it is full. (3775)

!!! ERROR - Database backup utility FAILED !!! (8563)

FACT(s) (Environment):

OpenEdge Replication
Fathom Replication
Progress 9.1D
Progress 9.1E
OpenEdge 10.x
All Supported Operating Systems

CAUSE:

The network outage exceeded the value specified for the connect-timeout parameter.

If the outage lasts longer than the connect-timeout value, the replication server process (on the source database) shuts down automatically. OpenEdge/Fathom Replication stops replicating, but normal database activity can continue. All new activity accumulates in the after-image files, and if replication is not re-enabled before AI space runs out, all AI areas become LOCKED and the database cannot accept any new transactions.

Progress backups (probkup) fail because probkup automatically attempts to switch to an empty AI area, which it cannot do while all AI areas are LOCKED.
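
A quick way to confirm this condition is to count the AI extents by state. The sketch below is illustrative only: it assumes the extent list is produced by "rfutil <dbname> -C aimage extent list" and that each extent is reported with a line of the form "Status:  <State>". The function names are hypothetical helpers, not part of any Progress utility.

```shell
#!/usr/bin/env bash
# Summarize AI extent states from an extent listing read on stdin.
# Assumption: every extent is reported with a "Status:  <State>" line.

# ai_status_summary: print one "<STATE> <count>" line per state found.
ai_status_summary() {
    awk '$1 == "Status:" { count[toupper($2)]++ }
         END { for (s in count) print s, count[s] }' | sort
}

# ai_has_empty: exit 0 if at least one extent is EMPTY, 1 otherwise.
ai_has_empty() {
    awk '$1 == "Status:" && toupper($2) == "EMPTY" { found = 1 }
         END { exit (found ? 0 : 1) }'
}
```

For example, "rfutil <dbname> -C aimage extent list | ai_status_summary" would print the number of extents in each state, and piping the same listing into ai_has_empty indicates whether a backup has an empty area to switch to.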

FIX:

There are three ways to resolve this problem:

Option 1:

Restart both the replication server and replication agent.

Once the databases have synchronized, it may take some time for all of the data in the oldest AI area to be replicated. Only after all of that data has been replicated and the area has transitioned from LOCKED -> FULL -> EMPTY can the database switch to an empty AI area and resume normal transaction activity.

Pro: No additional AI areas need to be added to the database.
Pro: Replication does not need to be re-created / re-initialized.
Con: All data from the oldest AI area needs to be replicated before normal database transaction activity can resume.

To restart the replication server, either stop and restart the source database or, if the database is currently running, run the following command:

dsrutil <dbname> -C restart server

To restart the replication agent, shut down and restart the target database.


Option 2:

Add additional AI areas to the database. There are some caveats to using this option depending on the version of Progress being used.

Pro: Database activity can resume as soon as new AI areas have been added.
Con: Disk space may not be readily available for additional AI areas.

If you are using Progress pre-9.1E, 10.0B, or 10.1A, adding additional AI areas will only resolve the problem if the current BUSY AI area is the last physical AI area.

After-imaging works sequentially in a ring. For example, suppose you have AI areas A, B and C, where A and C are LOCKED and B is BUSY. If you add new AI areas D and E, after-imaging cannot jump from B to D to reach an empty area, because C is the next area in the ring.
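
The ring behavior can be illustrated with a toy model. This is a sketch only: it does not touch a real database, and the function name and the "<name>:<STATE>" notation are hypothetical conventions chosen for the example.

```shell
#!/usr/bin/env bash
# Toy model of the AI extent ring.
# Each argument is "<name>:<STATE>", listed in physical extent order.

# next_ai_switch: print the extent that follows the BUSY one in the ring.
# Returns 0 if that extent is EMPTY (switch possible), 1 if it is not,
# and 2 if no BUSY extent was given.
next_ai_switch() {
    local extents=("$@")
    local n=$#
    local i
    local busy=-1
    for ((i = 0; i < n; i++)); do
        if [[ ${extents[i]#*:} == BUSY ]]; then
            busy=$i
        fi
    done
    if ((busy < 0)); then
        echo "no BUSY extent" >&2
        return 2
    fi
    local next=${extents[(busy + 1) % n]}
    echo "${next%%:*} ${next#*:}"
    [[ ${next#*:} == EMPTY ]]
}

# D and E were appended after C, so the area after B is still C (LOCKED).
next_ai_switch A:LOCKED B:BUSY C:LOCKED D:EMPTY E:EMPTY || echo "switch blocked"
```

In this model the switch only succeeds when an EMPTY area directly follows the BUSY one, which is also why appending new areas works when the BUSY area is the last physical area.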

If you are using Progress 9.1E or higher, or OpenEdge 10.1B03 or higher, you can add new AI areas and then reorder the AI extents, which places the new empty AI areas immediately after the BUSY area.

a) Create a structure file containing only the AI areas that you wish to add to the database, and add them.
b) Use the following command to re-order the AI areas:

prostrct reorder ai <dbname>
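
Steps a) and b) might look like the following sketch. The database name, directory, structure file name and extent numbers are hypothetical, and the "prostrct add <dbname> <structure-file>" form is assumed from standard prostrct usage rather than taken from this article. Whether the add must be done offline or online depends on the release and is not covered here. The script only prints the commands unless RUN=1 is set.

```shell
#!/usr/bin/env bash
# Sketch of steps a) and b). All names below are hypothetical.

DB=${DB:-mydb}                  # hypothetical database name
AI_DIR=${AI_DIR:-/db/ai}        # hypothetical directory with free space
ST_FILE=${ST_FILE:-add_ai.st}   # structure file holding only the new AI areas

# write_ai_st: write one variable-length AI extent line ("a <path>")
# to the structure file for each extent number given.
write_ai_st() {
    local n
    for n in "$@"; do
        printf 'a %s/%s.a%s\n' "$AI_DIR" "$DB" "$n"
    done > "$ST_FILE"
}

# run: print the command; execute it only when RUN=1.
run() {
    echo "+ $*"
    if [[ ${RUN:-0} == 1 ]]; then
        "$@"
    fi
}

write_ai_st 4 5                      # assumes extents 1-3 already exist
run prostrct add "$DB" "$ST_FILE"    # step a): add the new AI areas
run prostrct reorder ai "$DB"        # step b): move them after the BUSY area
```

Reviewing the printed commands and the generated structure file before setting RUN=1 keeps the change easy to check on a stalled production database.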


Option 3:

Disable replication and then re-initialize replication.

By disabling replication, all AI areas will become FULL and can then be archived and emptied. You will then need to re-enable replication and re-create the target database.

Pro: Quick to re-establish normal database transaction activity.
Pro: No additional AI areas need to be added to the database.
Con: Must re-enable replication and rebuild (re-initialize) the target database.

To disable replication on the source database, run the command:

dsrutil <dbname> -C disablesitereplication source