Kbase P23047: Events that could cause a process to terminate abnormally
Author: Progress Software Corporation - Progress
Access: Public
Published: 11/13/2007
Status: Verified
GOAL:
What causes Abnormal shutdown of the database?
GOAL:
What triggers Begin ABNORMAL shutdown code x (2249)
GOAL:
Begin ABNORMAL shutdown code (2249)
GOAL:
Why does the broker Disconnect dead user (2527)
GOAL:
Disconnecting dead user <number>. (2527)
GOAL:
How can users die with shared memory locks? (2522)
GOAL:
User <num> died holding <num> shared memory locks. (2522)
GOAL:
Why do users die with buffers locked? (2523) (5027) (5028)
GOAL:
User <num> died with <num> buffers locked. (2523)
GOAL:
User <num> died with <num> buffers locked. (5027)
GOAL:
SYSTEM ERROR: Releasing regular latch. latchId:<latch-num> (5028)
GOAL:
Events that could cause a process to terminate abnormally
FACT(s) (Environment):
All Supported Operating Systems
Progress/OpenEdge Versions
FIX:
When a user connects to a Progress database in multi-user mode, the user control table in shared memory is updated to add the user. The Progress Watchdog, or the server if no Watchdog is in use, periodically checks that every user listed in the user control table has a corresponding OS process running. If the Watchdog detects that a user is listed in the Progress user control table but has no active OS process, the shared memory user or remote server is disconnected from the Progress database and messages similar to the following are written to the log file:
BROKER 0: BROKER detects death of server 6464 (1153)
WDOG 12: Disconnecting client 55 of dead server 1 (2526)
BROKER 0: Disconnecting dead user 41 (2527)
However, if the Progress Watchdog detects that shared memory or buffer latches have been left in an inconsistent state by this "dead process", the broker will then shut the database down to protect the integrity of the database. The following message will be written to the log file, followed by all users being logged out and the server shutting down:
03:19:55 BROKER 0: Begin ABNORMAL shutdown code 2 (2249)
This error in itself does not indicate database corruption, merely that the broker has taken measures to prevent potential corruption. When the database is restarted, crash recovery is performed, and the database is ready for normal use if no errors are reported during crash recovery. NOTE: crash recovery may take some time, especially if there are uncommitted transactions to roll back during the Physical and Logical undo phases.
In addition to the above errors, some of the following messages may be written to the .lg file as the abnormal shutdown proceeds and logs out all the users. They are written by the WDOG if it is enabled, or by the Primary login BROKER if it is not:
SYSTEM ERROR: redundant lwake user <n> latch <x>
SYSTEM ERROR: bkrlsbuf: cannot release buffer lock, use count 0 is invalid. (1051)
These errors do not indicate database corruption.
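The Watchdog behavior described above can be pictured with a small model. This is only an illustrative sketch, not Progress code: the `UserEntry` record, its fields, and the `watchdog_pass` function are hypothetical names, and the real user control table lives in shared memory and is not exposed this way. The liveness test uses the POSIX convention of sending signal 0 to a pid, so it is not suitable for Windows.

```python
import os
from dataclasses import dataclass


@dataclass
class UserEntry:
    """Hypothetical stand-in for one row of the user control table."""
    user_num: int
    pid: int
    latches_held: int = 0    # shared memory latches still held
    buffers_locked: int = 0  # buffer locks still held


def process_alive(pid):
    """Return True if an OS process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 tests for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def watchdog_pass(users, is_alive=process_alive):
    """Check every listed user once; return (actions, shutdown_needed)."""
    actions = []
    shutdown_needed = False
    for u in users:
        if is_alive(u.pid):
            continue  # user still has an OS process, nothing to do
        if u.latches_held or u.buffers_locked:
            # Shared memory may be inconsistent: protect the database.
            actions.append(f"User {u.user_num} died holding latches: abnormal shutdown")
            shutdown_needed = True
        else:
            actions.append(f"Disconnecting dead user {u.user_num}")
    return actions, shutdown_needed
```

The point of the model is the two outcomes: a dead user holding nothing is simply disconnected, while a dead user that still holds latches or buffer locks forces the abnormal shutdown.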
Events that could cause a process to terminate abnormally include:
1. sending a kill signal (other than SIGHUP) to the process
2. shutting off a terminal during an active Progress session when the terminal (tty or PC with terminal emulation) does not send a SIGHUP
3. the process aborting as a result of a system error (reported in the log file as error 49, for example). The abnormally terminated process still holds latches (locks) in shared memory, and a message similar to one of the following appears immediately prior to the (2249) message:
BROKER 0: SYSTEM ERROR: User 66 died during microtransaction. (2256)
Usr 41: User 41 died holding <num> shared memory locks. (2522)
Usr 41: User 41 died with <num> buffers locked. (2523)
Usr 41: User 41 died with <num> buffers locked. (5027)
SYSTEM ERROR: Releasing regular latch. latchId:<latch-num> (5028)
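When reviewing a database log after an abnormal shutdown, it can help to pull out the messages listed above that appear just before each (2249) line. The following is a minimal sketch under stated assumptions: it assumes each log line ends with its message number in parentheses, as in the examples in this solution, and the function names and the 20-line window are arbitrary illustrative choices. Actual .lg line layouts differ between Progress/OpenEdge versions.

```python
import re

# Message numbers this solution lists as appearing before (2249).
PRECURSORS = {"2256", "2522", "2523", "5027", "5028"}
SHUTDOWN = "2249"
MSG_NUM = re.compile(r"\((\d+)\)\s*$")


def msg_number(line):
    """Return the trailing message number of a log line, or None."""
    match = MSG_NUM.search(line)
    return match.group(1) if match else None


def find_precursors(log_lines, window=20):
    """For each (2249) line, collect precursor lines from the preceding window."""
    results = []
    recent = []  # the most recent `window` lines seen so far
    for raw in log_lines:
        line = raw.rstrip("\n")
        if msg_number(line) == SHUTDOWN:
            hits = [prev for prev in recent if msg_number(prev) in PRECURSORS]
            results.append((line, hits))
        recent.append(line)
        recent = recent[-window:]
    return results
```

The function accepts any iterable of lines, so an open log file can be passed directly. Each result pairs a shutdown line with the user-death messages that preceded it, which identifies the user number to investigate.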
For sites where abnormal shutdowns of this kind occur frequently, one way to prevent them is to run all users as "remote" clients (i.e. connected over TCP/IP rather than through shared memory), even if they log in directly to the machine where the database resides. In this way, the user processes do not access shared memory directly. Instead, they connect to a server process which accesses shared memory. The server process obtains shared memory and buffer latches on behalf of the user process, so if the user process terminates abnormally, the server process is still running and can clean up any remaining latches.
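As a rough illustration of the difference between the two connection styles, the sketch below builds the connection parameters for each. The helper names are hypothetical, and the host, service, and database values are placeholders. It assumes the standard Progress connection parameters -db, -H (host), -S (service name or port), and -N (network type), and that the database broker was itself started with a service so that it accepts network connections.

```python
def self_service_params(db_path):
    """Shared memory connection: the client attaches to shared memory directly."""
    return ["-db", db_path]


def remote_connect_params(db_name, host, service):
    """Client/server connection over TCP/IP, even when host is the local machine."""
    return ["-db", db_name, "-H", host, "-S", service, "-N", "TCP"]
```

For example, `remote_connect_params("sports", "localhost", "5000")` yields the parameters for a local user who connects through a server process instead of shared memory. The resulting list could be placed in a parameter file or appended to the client startup command.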
When users repeatedly die holding shared memory or buffer latches (locks), it is important to determine why the user processes are dying and to resolve that underlying problem.