Kbase P115187: semAdd () CreateEvent Error causes database to shutdown abnormally
Author |
  Progress Software Corporation - Progress |
Access |
  Public |
Published |
  6/3/2008 |
|
Status: Verified
FACT(s) (Environment):
Progress 9.1E
OpenEdge 10.1A
Windows
OpenEdge 10.0B
OpenEdge Category: Database
SYMPTOM(s):
Database shuts down after receiving a semAdd() error.
Character-based application on Windows
semAdd() CreateEvent NT System Error: 5
semAdd() CreateEvent NT System Error: 0, 5: Access is denied
semAdd - WTFMO(1404) dwRet(-1) - NT System Error = 6
Stacktrace from _mprosrv reads:
dsmContextWriteOptions
Stacktrace from _progres reads:
utConvertToIx
Stacktrace from kernel32.dll reads:
dbut_bufmov
The database only shuts down when the character-based self-service client throws a semAdd() CreateEvent error while it is either holding shared memory locks or in the middle of a microtransaction
Database crashes with a semAdd() CreateEvent error, sometimes accompanied by error 5026, 5027, or 2255
semAdd() CreateEvent messages occur regularly in the database log file
User <num> died holding <num> shared memory locks. (5026)
User <num> died with <num> buffers locked. (5027)
SYSTEM ERROR: Incomplete microtransaction. (2255)
Application event log on the server shows error 354
Errors are not associated with lack of memory resources on the server
No error 354 permission conflicts for users accessing the Temporary Directory (-T)
Unable to open or create C:\temp\lbi , error 13. (354)
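To check how often these symptoms occur, the database log can be scanned for the messages listed above. The following is a minimal sketch in Python, not a Progress utility: the function name count_symptoms and the pattern table are assumptions made for illustration, and the patterns only match the message text and message numbers quoted in this article.

```python
import re
from collections import Counter

# Patterns built from the symptom messages quoted in this article.
# The number in parentheses at the end of a message identifies the error.
PATTERNS = {
    "semAdd": re.compile(r"semAdd\b"),
    "5026": re.compile(r"\(5026\)"),
    "5027": re.compile(r"\(5027\)"),
    "2255": re.compile(r"\(2255\)"),
}


def count_symptoms(lines):
    """Count how many lines mention each symptom (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

Passing an open log file, for example count_symptoms(open(path, errors="replace")), returns a per-symptom line count. Regular semAdd hits without any 5026, 5027, or 2255 hits would be consistent with clients failing while not holding locks.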
CAUSE:
Bug# OE00128751
CAUSE:
This CreateEvent issue is the result of a code change made in 9.1D09, introduced because character mode clients could not break out of a wait for a record held by another user. The previous method of waiting on a semaphore did not work in that situation: users would press ctrl-c, ctrl-brk, etc. and nothing would happen. This functionality, which works on Unix, did not work on Windows because the WaitForSingleObject call would not stop once our signal handler had processed the ctrl-brk. Control returned to WaitForSingleObject, which had been called with an infinite wait.
The CreateEvent code replaced the old code so that one of the events we wait for is the ctrl-brk, and if we get it, we react to it accordingly. This change still caused an access denied error in some Windows character applications, even when record contention was not involved. The access denied error is caused by either:
- conflicts between event names stored in the same global namespace on the terminal server, even though each event is created with a unique name based on time.
- Windows treating a thread spawned by a signal as a child; a child process cannot inherit the event handle if no default security attribute structure is supplied to CreateEvent.
In these situations the CreateEvent call fails. The process that received the error does not exit cleanly, and we bring down the database.
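The difference between the old single-object wait and the newer multi-event wait can be illustrated with a loose analogy. The sketch below is Python, not the Win32 code Progress uses, and every name in it (RECORD_RELEASED, CTRL_BREAK, wait_for_record) is hypothetical. It only shows the idea that the waiter blocks on a set of possible events and reacts to whichever arrives first, so a break request can end a wait that would otherwise last until the record is released.

```python
import queue

# Hypothetical event names, chosen for this illustration only.
RECORD_RELEASED = "record_released"
CTRL_BREAK = "ctrl_break"


def wait_for_record(events):
    """Block until any event is posted, then report which one ended the wait."""
    event = events.get()  # wakes on the first posted event, whichever it is
    if event == CTRL_BREAK:
        return "wait cancelled by user"
    return "record lock acquired"
```

A waiter that listened only for the record release would have no way to notice the break request, which is the behaviour described for the original infinite wait.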
FIX:
Upgrade to OpenEdge 10.1B, where the CreateEvent
The underlying problem is that self-service users are disconnecting their sessions while they still have locks in memory. For the reasons above, we cannot clean up after them without compromising integrity, so we fail safe and shut down. The subtext here, to make it clear, is the handling of 'backout' for an abnormally terminated client session. Most often, a client receives the "record in use by... wait or press ctrl-brk" message and then eventually "X-terminates" the session or uses End Task (that is, kills it). It is strongly advised to investigate why users find it necessary to terminate their application sessions by such means.
A workaround for this issue is to change to a client/server environment by starting the database with -H <hostname> -S <port number>. Clients then connect with the same -H and -S parameters that the database was started with. This creates a client/server connection instead of a self-service client connection to the database.