Kbase P36402: Why does a database shut down with the message "User died holding shared memory lock"?
Author: Progress Software Corporation - Progress
Access: Public
Published: 21/05/2009
Status: Verified
GOAL:
Why does a database shut down with the message "User died holding shared memory lock"?
GOAL:
What happens when a user connected to the database through shared memory is killed?
GOAL:
User died holding shared memory or buffer latch / lock
GOAL:
Disconnecting dead user . (2527)
GOAL:
User died holding shared memory locks. (2522)
GOAL:
User died with buffers locked. (2523)
GOAL:
Begin ABNORMAL shutdown code x (2249)
GOAL:
System Error: redundant lwake user <n> latch <x>
FIX:
When a user connects to a Progress database in multi-user mode, the user control table in shared memory is updated to add the user. The Progress watchdog, or the server if no watchdog is running, periodically checks to ensure all users listed in the user control table have corresponding processes running.
If the watchdog or server determines that a user listed in the Progress user control table does not have an active process, the user is disconnected from the database and message (2527) is written to the database log file. For example: Disconnecting dead user . (2527)
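The liveness check described above can be illustrated with a minimal shell sketch. This is not the Progress watchdog's implementation: the function names and the "user-number pid" input format are hypothetical stand-ins for the user control table, and the sketch only shows the general idea of testing whether a listed process still exists.

```shell
#!/bin/sh
# Illustrative only: NOT how the Progress watchdog is implemented.

# is_alive PID -> exit status 0 if a signal could be sent to PID.
# Signal 0 performs the existence and permission check without
# delivering any signal. A process owned by another user also fails
# this check (permission denied), so run it as a suitably privileged user.
is_alive() {
    kill -0 "$1" 2>/dev/null
}

# find_dead_users reads "usernum pid" pairs on stdin (a hypothetical
# stand-in for the user control table) and prints the user numbers
# whose process no longer exists.
find_dead_users() {
    while read -r usernum pid; do
        [ -n "$pid" ] || continue
        if ! is_alive "$pid"; then
            echo "$usernum"
        fi
    done
}
```

For example, `printf '5 %s\n' "$$" | find_dead_users` prints nothing, because the calling shell is still running.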
If the user's process terminated abnormally while holding latches / locks in shared memory, the following messages are written to the database log file:
User died holding shared memory locks. (2522)
User died with buffers locked. (2523)
If shared memory or buffer latches are left in an inconsistent state, the server shuts down the database to ensure data integrity. Message (2249) is written to the database log file, then all users are logged out and the server shuts down. For example: Begin ABNORMAL shutdown code x (2249). The code number in this message is reserved for future use and currently has no meaning.
The shutdown does not indicate data corruption; it occurs to prevent the possibility of corruption. When the database is restarted, it goes through crash recovery and is then ready for normal use.
In addition to the above messages, message System Error: redundant lwake user <n> latch <x> may be written to the .lg file as the abnormal shutdown proceeds and logs out all the users. This error also does not indicate database corruption.
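To check whether a database went through this sequence, the log file can be searched for the message numbers listed above. The following is a minimal sketch: the function names are hypothetical, and it assumes only that each message number appears in parentheses on its log line, as in the examples in this article.

```shell
#!/bin/sh
# scan_db_log FILE -> prints the lines of FILE that carry one of the
# message numbers discussed above: (2249), (2522), (2523), (2527).
scan_db_log() {
    grep -E '\((2249|2522|2523|2527)\)' "$1"
}

# had_abnormal_shutdown FILE -> exit status 0 if message (2249),
# the abnormal shutdown message, appears anywhere in FILE.
had_abnormal_shutdown() {
    grep -q '(2249)' "$1"
}
```

A typical use would be `scan_db_log db-name.lg`, where `db-name.lg` is assumed to be the database log file in the current directory.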
Events that cause a process to terminate abnormally include:
1. Sending a kill signal other than SIGHUP to the process.
2. Shutting off a terminal while an active Progress session is running. If the terminal sends SIGHUP when it is shut off, this problem will not occur.
3. The process aborting as a result of a system error.
The signal handling in PROGRESS has been improved to include more signals. If this error occurs repeatedly, it is important to determine what is causing the user process to die.
For sites where this occurs frequently, one way to prevent an abnormal shutdown is to run users as remote clients. For users who log in directly to the database machine, connecting as a remote client means the user process does not access shared memory directly. Instead, the client connects to a server process, which accesses shared memory and obtains shared memory and buffer latches on behalf of the user process. If the user process terminates abnormally, the server process is still running and can clean up any remaining latches.
Examples:
1) start server with network parameters: proserve db-name -S <port number> -H <host name>
2) start user session as local "remote" client from same machine: mpro db-name -S <port number> -H <host name>
Note: There might be differences in performance between TCP and shared memory connections. Shared memory connections are usually faster because they do not go through the TCP protocol.