Consultor Eletrônico



Kbase 16802: Watchdog and _progres cleanup ; Debug for HANGUP Signals
Author   Progress Software Corporation - Progress
Access   Public
Published   30/04/2004
Solution ID: 16802

GOAL:


Watchdog and _progres cleanup ; Debug for HANGUP Signals

FIX:


Overview:
 
If customers report that their users are shutting off their terminals, closing their windows, or losing a modem connection while still connected to Progress, and the _progres session either becomes a runaway process or is not removed, follow this process to debug the situation. You can also use this process to debug issues where a client appears to terminate in an acceptable manner (i.e., one of those listed above) yet the watchdog brings down the broker after cleaning up the process.
 
What should happen:
 
+------------+
| terminal   |
|            |
+------------+
|
this connection
being dropped
should send a
kill -1 or a signal 1
to the broker process
|
kill (-1)
|
+------------+
| Broker     |
|            |
+------------+
 
This should log a Hangup Signal Received message in the .lg file, followed by a Logout, and the process should then terminate at the OS level.
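
The check described above can be roughly automated. The following is a minimal sketch, not a Progress utility: check_hangup_logout is a hypothetical helper, and it assumes only that the hangup entry contains the word "hangup" and the logout entry contains the word "logout" (case-insensitive). The exact message text in your .lg file may differ, and the sketch does not match entries by user or process, so treat its answer as a hint on a quiet test database only.

```shell
#!/bin/sh
# Illustrative sketch only; check_hangup_logout is a hypothetical helper.
# Assumption: the .lg entries of interest contain the words "hangup" and
# "logout"; adjust the patterns to the actual message text in your log.

check_hangup_logout() {
    awk '
        tolower($0) ~ /hangup/ { seen = 1; ok = 0; next }   # remember the latest hangup
        seen && tolower($0) ~ /logout/ { ok = 1 }           # a logout appeared after it
        END {
            if (!seen)   print "no hangup recorded"
            else if (ok) print "hangup followed by logout"
            else         print "hangup without logout"
        }
    ' "$1"
}
```

Usage would be along the lines of check_hangup_logout demo.lg, where demo.lg is an assumed log file name for the demo database.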
 
Progress should normally be able to handle and disconnect processes that terminate abnormally when a kill -1 is sent to the process. The signals that cause problems for us are kill -9 and kill -8, which we can't trap.
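
To see the difference between a trappable and an untrappable signal outside of Progress, a small shell experiment can help. This is a sketch under stated assumptions: probe_signal is a hypothetical helper, the child is a plain shell process rather than _progres, and only HUP (signal 1) and KILL (signal 9) are compared. The child installs a handler for HUP, and the helper reports whether that handler ran.

```shell
#!/bin/sh
# Illustrative sketch only; probe_signal is a hypothetical helper, not a Progress tool.

probe_signal() {
    sig=$1
    dir=$(mktemp -d) || return 1

    # Child shell: install a HUP handler, announce readiness, then idle.
    sh -c '
        trap "echo caught > \"$1/caught\"; exit 0" HUP
        : > "$1/ready"
        while :; do sleep 1; done
    ' sh "$dir" >/dev/null 2>&1 &
    pid=$!

    # Wait (up to about 10 seconds) until the handler is installed.
    tries=0
    while [ ! -e "$dir/ready" ] && [ "$tries" -lt 10 ]; do
        sleep 1
        tries=$((tries + 1))
    done

    kill -s "$sig" "$pid"
    wait "$pid" 2>/dev/null || true

    # The marker file exists only if the child's handler actually ran.
    if [ -e "$dir/caught" ]; then
        echo "trapped"
    else
        echo "not trapped"
    fi
    rm -rf "$dir"
}

echo "HUP:  $(probe_signal HUP)"
echo "KILL: $(probe_signal KILL)"
```

On a typical UNIX system the HUP case should report that the handler ran, while the KILL case should not, because a process never gets the chance to run a handler for signal 9.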
 
 
How to Debug:
 
1) Test the situation against the demo database. Start a client session against the demo database, and drop the session in the same manner that is causing the problem.
 
2) Reconnect or use another window to check the .lg file to see if the HANGUP signal is recorded.
 
   a) If the HANGUP signal is there:
      i) If it is followed by a logout, the process should no longer be listed at the OS level, and things are working normally.
      ii) If it is NOT followed by a logout, this is a BUG. Stop debugging.
   b) If the HANGUP signal is NOT there, we need to find out where it is getting lost (go to step 3).
 
3) Use our sigtst.c program, available in Progress Solution 17080, to determine which signals are being sent from your terminal when the connection is dropped. The directions on how to run it and interpret the results are included in that kbase. It is a short C program with instructions on how to compile it. The test takes less than 5 minutes.
 
4) sigtst.c should show that the disconnection is sending a kill -1 (a hangup signal).
   a) If it is NOT sending a kill -1, then this is the problem. This is NOT a Progress bug; it is a problem for the terminal or terminal emulation vendor to solve. They must send a kill -1 when the connection is dropped. Stop debugging.
   b) If it IS sending a kill -1, we need to find out why it isn't making it through to the broker (go to step 5).
 
5) How are they starting the client process? Are they using their own script? If yes, check whether they are trapping any signals in their startup script. This is evident from the word "trap" appearing before _progres is executed. If it exists, remove the trap statement and retest the situation to see if this changes the behavior.
 
6) Check how _progres is being started. They should be using "exec _progres" rather than "_progres" without the exec. We have seen cases where the missing exec has prevented signals from getting through.
 
7) Is the _progres executable one that was probuilt using HLC or HLI? If yes, find out whether the C code that was compiled with it does any signal handling. This may be causing the problem, and the code should be re-evaluated by the provider of the module.
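
The script checks in steps 5 and 6 can be done by eye, but a quick scan helps with long startup scripts. The following is a minimal sketch: check_startup_script is a hypothetical helper, and its pattern matching is deliberately crude (it skips comment lines, flags any line that looks like a trap statement, and flags any line that mentions _progres without an exec in front). With exec, the shell is replaced by _progres, so no intermediate shell process is left between the terminal and the client. Anything the helper reports still needs to be reviewed by a person.

```shell
#!/bin/sh
# Illustrative sketch only; check_startup_script is a hypothetical helper.
# It reports lines that deserve a manual look, nothing more.

check_startup_script() {
    awk '
        /^[ \t]*#/ { next }                                   # skip comment lines
        /(^|[ \t;])trap[ \t]/ {
            printf "line %d: trap statement\n", NR            # step 5
        }
        /_progres/ && !/(^|[ \t;])exec[ \t]/ {
            printf "line %d: _progres started without exec\n", NR   # step 6
        }
    ' "$1"
}
```

Usage would be along the lines of check_startup_script /path/to/startup.sh, where the path is a placeholder for the customer's own script. No output means neither pattern was found.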
 
Most cases where customers report that processes are not terminating after shutting off a terminal or after a modem line is dropped can be resolved by running through this process.
 
 
Additional information:
 
Sometimes when a process becomes "runaway", a kill -1, -2, or -15 does nothing. This may be because the client process is in the middle of a database operation that can't be interrupted. In our code we have a flag that we turn on when performing such operations. The flag is usually on for a relatively short period of time. While the flag is on, certain signals such as hangup, interrupt, or terminate are trapped but ignored by our signal handler. Once the flag is off, those signals are handled as they usually are. It is possible that the process is stuck inside this code, which could be a Progress bug.
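
The flag mechanism described above can be imitated in a few lines of shell. This is only an analogy and says nothing about how the Progress code is actually written: the names in_critical, on_hup, flag_demo, and run_flag_demo are hypothetical, and the "uninterruptible operation" is just the stretch of script where the flag is set. The demo runs in a child shell so that the hangup signals go to that child and not to your own session.

```shell
#!/bin/sh
# Illustrative sketch only; in_critical and on_hup are hypothetical names.
# The demo runs in a child shell so that $$ is the child's own PID.

flag_demo='
in_critical=0

on_hup() {
    if [ "$in_critical" -eq 1 ]; then
        echo "HUP ignored (flag on)"
    else
        echo "HUP handled (flag off)"
    fi
}
trap on_hup HUP

in_critical=1        # entering the uninterruptible operation
kill -s HUP $$       # arrives while the flag is on
:                    # command boundary where the pending trap runs
in_critical=0        # operation finished

kill -s HUP $$       # arrives while the flag is off
:
'

run_flag_demo() {
    sh -c "$flag_demo"
}

run_flag_demo
```

The first hangup should be reported as ignored and the second as handled. A process that never leaves the flagged stretch would keep ignoring kill -1, -2, and -15, which is the runaway behavior described above.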