Kbase P14220: How to debug PROGRESS process hang.
Author   Progress Software Corporation - Progress
Access   Public
Published   2/17/2009
Status: Verified

GOAL:

How to troubleshoot PROGRESS processes that appear to hang.

FIX:

The PROGRESS monitor utility, promon, is mentioned in the debugging steps below. When running promon to check hung processes, it should be run with the -F option. The -F option runs the promon utility without using the login semaphore.

Note that promon -F is not the same as forcing access to the database with the -F qualifier of proutil truncate bi and, unlike proutil truncate bi -F, it does not pose any risk to database integrity.
That is not to say that promon with -F is 'safe'. It should be used only in an emergency, when the database and all processes connected to it are completely hung. With -F, the promon executable skips locking the login semaphore (i.e., it reduces the latch count for the USR _connect table). If other processes are logging in while _mprshut is logging in with -F, the results can be unpredictable. Among other things, two connections could end up using the same usrctl, which is likely to cause all sorts of problems.

Use promon -F only if all the other processes and the database appear to be hung, as this Solution discusses.


When a process appears hung, the following should be checked:

1) Record Locking Table in promon:
Check to see if the user is waiting to obtain an exclusive lock on a record that is share locked or exclusively locked by another user. If so, this may be what is causing the "hang".

If this is the case, there are two options: The first is to let the user wait for the user who owns the lock to release it. The second is to disconnect one of the users, either the one waiting for the lock or the one who holds the lock. The user can be disconnected either from the shutdown menu in promon or from proshut.
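As a rough sketch of the proshut route mentioned above, the helper below only prints the two commands an administrator would typically review before running them: one to list connected users and one to disconnect a user by number. The function name disconnect_commands and the example database name are hypothetical; the proshut "-C list" and "-C disconnect" qualifiers are assumed to be available in the installed PROGRESS version, so check the documentation for your release.

```shell
# Print, without executing, the proshut commands used to find and
# disconnect a user. Nothing is run against the database.
# Usage: disconnect_commands <dbname> <usernum>
disconnect_commands() {
    db=$1
    usernum=$2
    if [ -z "$db" ]; then
        echo "database name required" >&2
        return 1
    fi
    # Reject an empty or non-numeric user number
    case $usernum in
        ''|*[!0-9]*)
            echo "user number must be numeric: '$usernum'" >&2
            return 1
            ;;
    esac
    echo "proshut $db -C list"                  # show connected users and their numbers
    echo "proshut $db -C disconnect $usernum"   # disconnect the chosen user
}
```

For example, "disconnect_commands sports 12" prints the two commands for user number 12 on a database named sports. Printing rather than executing leaves the decision about which user to disconnect with the administrator.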

2) Circular RM Chain
When new records are created or existing records are updated, PROGRESS searches the record manager (RM) chain of the database, looking for a block with enough space to hold the new data. The RM chain contains all blocks in the database that have some free space. Each block on the RM chain points to the next block in the chain. If one of these pointers becomes corrupted, the process may end up looping through the RM chain instead of searching it sequentially. Other processes that then access the same block while updating may also hang.

To determine if this is the cause of the hanging process, do the following:

2.1) Run the "ps -ef" UNIX command and check to see if the CPU time of the process is increasing. If not, then a circular RM chain is not the cause of the hang.

2.2) Run promon and enter "R&D" at the prompt for the selection.
Choose option 2, Activity Displays, then option 10 - Space Allocation. Look at the number reported for "rm blocks examined". If the number is very high, in the thousands, then the database may have a circular RM chain.

2.3) If the process is using CPU time and the rm blocks examined is high, then the RM and Free chains in the database should be rebuilt:

a) truncate the BI file
b) back up the database **Very important**
c) run "proutil <dbname> -C dbrpr"
d) Select the "Database Scan Menu"
e) Enter 7 to turn on Rebuild RM chain.
f) Enter 8 to turn on Rebuild Free chain.
g) Enter "G" for GO to start the rebuild.
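
Steps a) to c) can be summarized as a command sequence. The sketch below only prints the commands so they can be reviewed first; it does not run them. The function name rebuild_chain_commands and the backup path are hypothetical, and the use of probkup for step b) is an assumption, since the step only says to back up the database. Steps d) to g) are interactive menu choices inside dbrpr and are not scripted here.

```shell
# Print, without executing, the command sequence for steps a) to c).
# Usage: rebuild_chain_commands <dbname> <backup-file>
rebuild_chain_commands() {
    db=$1
    backup=$2
    if [ -z "$db" ] || [ -z "$backup" ]; then
        echo "usage: rebuild_chain_commands <dbname> <backup-file>" >&2
        return 1
    fi
    echo "proutil $db -C truncate bi"   # a) truncate the BI file
    echo "probkup $db $backup"          # b) back up the database (probkup is an assumption)
    echo "proutil $db -C dbrpr"         # c) start the interactive repair utility
}
```

For example, "rebuild_chain_commands sports /backup/sports.bak" prints the three commands in the order they should be run.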

The time to rebuild the RM and Free chains depends on the size of the database and on the speed and load of the machine. It is usually not a very time-consuming operation, but for very large databases it can take 2 hours or more.
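
The CPU-time check in step 2.1 can also be scripted instead of comparing "ps -ef" listings by eye. The following is a minimal sketch, assuming a ps that supports the POSIX "-o time=" and "-p" options; the function name cpu_time_increasing and the default 5-second interval are illustrative choices, not part of PROGRESS.

```shell
# Sample the cumulative CPU time of a process twice and report whether it grew.
# Returns 0 if the CPU time changed, 1 if it did not, 2 if the pid cannot be read.
# Usage: cpu_time_increasing <pid> [interval-seconds]
cpu_time_increasing() {
    pid=$1
    interval=${2:-5}
    # "-o time=" prints the cumulative CPU time with no header line
    first=$(ps -o time= -p "$pid") || return 2
    sleep "$interval"
    second=$(ps -o time= -p "$pid") || return 2
    echo "pid $pid CPU time: $first -> $second"
    [ "$first" != "$second" ]
}
```

Because ps reports CPU time in whole seconds, a short interval can miss a process that is only lightly busy; a longer interval gives a more reliable answer.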

3) Troubleshooting Hang
If none of the above are likely causes of the hang, the following should be checked:

3.1) Is the process using CPU time? Is the process doing i/o?
- If so, then doing a "kill -USR1 <process-id>" to the process will generate a core file and, depending on your platform, a protrace.
- It may be that the application is looping. The 4GL code should also be checked.

3.2) Has anything been changed on the system or in the application? If so, investigate the changes.

3.3) Are there any errors reported in the database log file?

3.4) Are only PROGRESS processes hanging, or are other processes hanging as well? If other processes are hanging as well, then the operating system vendor should be contacted and the system checked out.

3.5) Is the hang reproducible? If so, determine what the steps are that result in the hang.
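
For step 3.1, the signal can be sent through a small wrapper that first confirms the process exists. This is a minimal sketch; the function name send_usr1 is hypothetical, and the wrapper does nothing PROGRESS-specific beyond the "kill -USR1" command given in that step.

```shell
# Send SIGUSR1 to a process after confirming that it exists.
# Usage: send_usr1 <pid>
send_usr1() {
    pid=$1
    # kill -0 sends no signal; it only checks that the pid exists and can be signalled
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process (or no permission): $pid" >&2
        return 1
    fi
    kill -USR1 "$pid"
}
```

Double-check the process id before sending the signal: a process that does not handle SIGUSR1 is terminated by it, since termination is the default action for that signal.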