Kbase P21490: Killing a hung batch process leaves the database in a hung state
Author   Progress Software Corporation - Progress
Access   Public
Published   11/11/2008
Status: Verified

SYMPTOM(s):

Database is down

Batch process is killed

Zombie _mprosrv process

Several _mprosrv processes from ps -ef

Cannot access database with Promon

Proshut db-name -by -F hangs

FACT(s) (Environment):

UNIX
Progress/OpenEdge Product Family

CAUSE:

Killing the batch process has left the database broker and server processes hanging, therefore the database cannot be accessed through shared memory connections.

FIX:

If possible, back up the database with an OS copy. Any other databases on the system should be properly shut down and the server rebooted to clear the memory. Any scripts that start these databases automatically should be disabled before the reboot. When the server comes back up, the problem database can be accessed with

pro db-name

This will force crash recovery, which can be confirmed in the db.lg file:

Begin Physical Redo Phase at 0 . (5326)
Physical Redo Phase Completed at blk 0 off 191 upd 0. (7161)
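
The log check above can be sketched as a small shell function. This is only an illustration: the function name check_crash_recovery and its output strings are hypothetical, and the only facts it relies on are the two message numbers, (5326) and (7161), quoted above.

```shell
# Hypothetical helper: report whether a database log file contains both
# crash-recovery messages quoted in this article.
# Usage: check_crash_recovery /path/to/db-name.lg
check_crash_recovery() {
    lg_file="$1"
    # (5326) marks the start of the Physical Redo Phase, (7161) its completion.
    if grep -q '(5326)' "$lg_file" && grep -q '(7161)' "$lg_file"; then
        echo "crash recovery completed"
    else
        echo "crash recovery not confirmed"
        return 1
    fi
}
```

Because the log file normally accumulates entries across restarts, a match may come from an earlier recovery, so the timestamps of the matching lines should still be checked by eye.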

Now run proserve. This should start the database in multi-user mode. The other databases can now be started as well.

If the database fails to start and gives an error message, check the Solutions for a possible resolution. If the database is unable to start, restore the last good backup and roll forward the ai files (if using after imaging).

As a last resort before restoring from backup, issue a kill -8 on the broker process (_mprosrv) for the problem database. To identify the broker process, run ps -ef | grep db-name
Both the database broker and the spawned server processes run as _mprosrv, but the broker does not have the -m1 parameter that the spawned servers do:

root 16688 1 0 08:49:46 - 0:00 /progsv73e/dlc/bin/_mprosrv <<broker


root 21636 13698 0 09:05:41 - 0:00 /progsv73e/dlc/bin/_mprosrv -m1 <<spawned server

A "Zombie" process might look like this:

root 28692 1 0 08:58:01 - 0:01 /progsv73e/dlc/bin/_mprosrv -m1

Notice that the parent PID column shows 1, as it does for the broker, but the -m1 indicates that this is a spawned server (child process). It should list the PID of its parent process, as the spawned server above does, but it is in a "zombie" state.
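
The distinction between the three kinds of process line can be sketched as a filter over ps -ef output. This is a minimal illustration, not a Progress utility: the function name classify_mprosrv and the labels it prints are hypothetical, and it assumes the PID and parent PID are the second and third columns, as in the sample lines above.

```shell
# Hypothetical helper: label each _mprosrv line read from standard input.
# Assumes ps -ef column order: UID PID PPID ...
classify_mprosrv() {
    awk '/_mprosrv/ {
        is_server = 0
        # A spawned server carries the -m1 parameter; the broker does not.
        for (i = 1; i <= NF; i++) if ($i == "-m1") is_server = 1
        if (!is_server)      kind = "broker"
        else if ($3 == 1)    kind = "orphaned server"   # -m1 but parent PID is 1
        else                 kind = "spawned server"
        print $2, $3, kind
    }'
}
```

It could be used as ps -ef | grep db-name | classify_mprosrv, which prints the PID, the parent PID, and the label for each matching line. A line labelled "orphaned server" corresponds to the "zombie" case described above.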


The kill -8 on the broker process _mprosrv will likely produce several errors such as: Error writing msg, socket=<n> errno=10054 usernum=<n> disconnected. (796)

Then run pro db-name and proserve as mentioned above.