Kbase 14147: How to debug PROGRESS process hang
Author   Progress Software Corporation - Progress
Access   Public
Published   5/10/1998

This notebook entry describes how to troubleshoot the problem of
one or more PROGRESS processes appearing to hang.

The PROGRESS monitor utility, promon, is mentioned in the debugging
steps below. When promon is used to check hung processes, it should
be run with the -F option, which runs the utility without locking
shared memory. [NOTE: promon -F is *not* the same as forcing into
the database with "pro -F". Unlike forcing in, it does not pose any
risk to database integrity.]

When a process appears hung, the following should be checked:

1) Record Locking Table in promon:
Check to see if the user is waiting to obtain an exclusive lock on
a record that is share locked or exclusively locked by another user.
If so, this may be what is causing the "hang". (Check the "Record
Locking Table" menu in promon.)
If this is the case, there are two options: The first is to let
the user wait for the user who owns the lock to release it. The
second is to disconnect one of the users, either the one waiting for
the lock or the one who holds the lock. The user can be disconnected
either from the shutdown menu in promon or from proshut.

2) PROGRESS 6.2 MT Lock Holder:
If the version of PROGRESS is 6.2, check to see if one user is
holding the MT lock for longer than usual. Kbase entry 12895
describes in detail how to check whether the MT lock holder is
causing other users to hang and what to do if this is the case.
The MT lock is a shared memory lock that is quickly passed among
all the users connected to the database. If one user holds the MT
lock, other users must wait for that user to finish before making
their updates.

3) Circular RM Chain:
When new records are created or existing records are updated,
PROGRESS searches the record manager (RM) chain of the database
looking for space in a block to put the new data. The RM chain
contains all blocks in the database that have some free space.
Each block on the RM chain points to the next block in the chain.
If one of these pointers becomes corrupted, instead of sequentially
searching through the RM chain, the process may end up looping
through it endlessly. Other processes that then access the same
block while updating may also hang.
To determine if this is the cause of the hanging process, do
the following:
   1) Run the "ps -ef" UNIX command several times and check to see
      if the cpu time of the process is increasing. If not, then
      a circular RM chain is not the cause of the hang.
   2) Run promon and enter "R&D" at the main menu prompt.
      Select "S" to sample activity counters for 30 seconds.
      In V6, choose option 11 - Space Allocation.
      In V7+, choose option 2 - Activity Displays, then
      option 10 - Space Allocation.
      Look at the number reported for "rm blocks examined".
      If the number is very high, in the thousands, then the
      database may have a circular RM chain.
   3) If the process is using cpu time and the "rm blocks
      examined" count is high, then the RM and Free chains in
      the database should be rebuilt:
      a) Truncate the BI file.
      b) Back up the database. **Very important**
      c) Run "proutil <dbname> -C dbrpr".
      d) Select the "Database Scan Menu".
      e) Enter 7 to turn on Rebuild RM chain.
      f) Enter 8 to turn on Rebuild Free chain.
      g) Enter "G" for GO to start the rebuild.
      [The time to rebuild the RM and Free chains
      depends on the size of the database and
      the speed and load of the machine. It is usually
      not a very time-consuming operation, but for very
      large databases it can take 2 hours or more.]
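
The two checks above (cpu time increasing, "rm blocks examined" in
the thousands) can be combined in a small helper. The following is
only a sketch and is not part of PROGRESS: the function names are
hypothetical, the cpu time strings are assumed to be copied by hand
from the TIME column of two successive "ps -ef" runs, the count is
assumed to be read by hand from the promon Space Allocation screen,
and the threshold of 1000 is an assumption standing in for "in the
thousands".

```python
def cpu_seconds(time_field: str) -> int:
    """Convert a ps TIME value ('1:05', '00:01:05', '2-03:04:05') to seconds."""
    days = 0
    if "-" in time_field:
        # Some ps implementations print long cpu times as days-hh:mm:ss.
        day_part, time_field = time_field.split("-", 1)
        days = int(day_part)
    seconds = 0
    for part in time_field.split(":"):
        seconds = seconds * 60 + int(part)
    return days * 86400 + seconds


def suspect_circular_rm_chain(time_before: str, time_after: str,
                              rm_blocks_examined: int,
                              threshold: int = 1000) -> bool:
    """Return True only when both symptoms are present.

    threshold is an assumed cut-off for "in the thousands"; adjust it
    to what is normal for the database in question.
    """
    cpu_increasing = cpu_seconds(time_after) > cpu_seconds(time_before)
    return cpu_increasing and rm_blocks_examined >= threshold


# Example values typed in by hand from two "ps -ef" runs and one
# promon sample.
print(suspect_circular_rm_chain("0:10", "0:40", 25000))
```

If the helper returns False because the cpu time is not increasing,
the process is waiting rather than looping, and the other causes in
this entry should be checked instead.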

4) -spin in Version 6.3E and 6.3F:
There is a bug in 6.3E and 6.3F that may cause all PROGRESS
users connected to the database to hang when the -spin broker startup
parameter is used. Patches to 6.3F are available for this problem.
To determine if -spin is the problem: run promon and enter "R&D"
at the main menu prompt. Select "S" to sample activity counters,
then check the different menu options for database activity.
If there appears to be no activity at all in the database,
then -spin is likely the cause of the hang.
If this is the case, either do not use the -spin startup parameter
or contact PROGRESS Technical Support for a patch.
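
"No activity at all" means that none of the sampled counters moved
during the sample period. A minimal sketch of that comparison is
shown below. It is not a PROGRESS tool: the counter names are
hypothetical, and the values are assumed to be copied by hand from
two promon activity screens taken some seconds apart.

```python
def counter_deltas(before: dict, after: dict) -> dict:
    """Return the change in each counter between two samples."""
    names = sorted(set(before) | set(after))
    return {name: after.get(name, 0) - before.get(name, 0) for name in names}


def no_database_activity(before: dict, after: dict) -> bool:
    """Return True when no counter changed between the two samples."""
    return all(delta == 0 for delta in counter_deltas(before, after).values())


# Hypothetical counter names and hand-copied values.
first_sample = {"db_reads": 5120, "db_writes": 870, "bi_writes": 44}
second_sample = {"db_reads": 5120, "db_writes": 870, "bi_writes": 44}

print(no_database_activity(first_sample, second_sample))
```

A single counter that is still moving is enough to show that the
database is doing some work, in which case -spin is a less likely
cause.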

5) Troubleshooting Hang:
If none of the above are likely causes of the hang, the following
should be checked:
   1) Is the process using cpu time? Is the process doing i/o?
      - If so, then sending "kill -8 <process-id>" to the process
        will generate a core file. Use a debugger (e.g. adb, sdb,
        dbx) to obtain a stacktrace of the process. Kbase entries
        3167 and 13024 describe how to use the debuggers that are
        most commonly found on UNIX systems.
      - Another option is to attach to the process with a
        debugger and get a stacktrace from the active process.
      - Send the stacktrace to PROGRESS Technical Support for
        analysis.
      - It may be that the application is looping. The 4GL code
        should also be checked.
   2) Has anything been changed on the system or in the application?
      If so, investigate the changes.
   3) Are there any errors reported in the database log file?
   4) Are only PROGRESS processes hanging or are there other
      processes hanging as well? If others are hanging as well,
      then the operating system vendor should be contacted and
      the system checked out.
   5) Is the hang reproducible? If so, determine what the steps
      are that result in the hang.
   6) Call PROGRESS Technical Support.
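
The "kill -8 <process-id>" step sends signal number 8, which is
SIGFPE on most UNIX systems; its default action is to terminate the
process and write a core file, provided the core file size limit of
the process allows one. The sketch below illustrates this on a
disposable child process rather than on a PROGRESS process. The
function names are hypothetical, and the child's core limit is set
to zero only so that the demonstration leaves no core file behind.

```python
import os
import resource
import subprocess
import sys


def _disable_core_file() -> None:
    # Demo only: stop the disposable child from writing a core file.
    # A real hung process needs a non-zero core limit to produce one.
    resource.setrlimit(resource.RLIMIT_CORE, (0, 0))


def send_signal_8(pid: int) -> None:
    """Equivalent of running "kill -8 <process-id>" from the shell."""
    os.kill(pid, 8)


def demo() -> int:
    """Start a sleeping child, send it signal 8, return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"],
        preexec_fn=_disable_core_file,
    )
    send_signal_8(child.pid)
    # A negative return code -N means "terminated by signal N".
    return child.wait()
```

Calling demo() returns the child's exit status; a negative value
reports the number of the signal that ended the process. On a real
hung process, check the core file size limit in effect before
sending the signal, since a limit of zero means no core file is
written.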

Progress Software Technical Support Note # 14147