Kbase 21568: WebSpeed 3.1x - support for PIDs with more digits
Author: Progress Software Corporation - Progress
Access: Public
Published: 10/16/2008
Status: Unverified
FACT(s) (Environment):
WebSpeed 3.1x
IBM AIX POWER 5.1
Compaq Tru64 UNIX
SYMPTOM(s):
WebSpeed agents fail to start in Tru64 clustered environments and on AIX 5.1.
On platforms other than Tru64 UNIX, WebSpeed agents fail to start on newer OS releases.
CAUSE:
The issue occurs on newer operating systems that construct PIDs with more digits than WebSpeed expects.
In Compaq Tru64 UNIX V5.0A and above, WebSpeed startup can fail because of the size of the agents' PID values. In a cluster configuration the PID value includes cluster node information, which makes it larger than 6 significant digits.
When a WebSpeed agent starts up, it sends a message to the broker. The message contains the PID of the agent. The broker uses this PID to check periodically that the agent has not terminated unexpectedly.
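One common way for a UNIX process to check whether another process is still running is to send it signal 0, which performs the existence check without delivering a signal. This article does not state how the WebSpeed broker performs its check, so the following is only a minimal sketch of the general technique; the function name pid_is_alive is hypothetical.

```python
import os

def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this PID exists (POSIX only).

    Signal 0 is never delivered; the call only checks that the
    target process exists and that we may signal it.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        # No process with this PID: a watcher would conclude it has terminated.
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True

# The current process is certainly running.
print(pid_is_alive(os.getpid()))
```

A check of this kind is only as good as the PID it is given: if the watcher holds the wrong PID, it reports on some other process, or on none at all.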
In 3.1C (and all earlier releases) the format of the message containing the PID allowed for a PID value of at most 6 digits. If the PID value has more than 6 digits, the broker mistakenly extracts only the first 6 digits, giving it the wrong PID value. The erroneous PID value does not match a running process, so the broker mistakenly concludes that the agent has terminated.
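The truncation can be illustrated with a minimal sketch. It assumes, hypothetically, that the PID travels as a decimal string and that the reader keeps a fixed 6-character field; the actual WebSpeed message layout is not described in this article. The function name and the sample 7-digit PIDs are illustrative only, chosen so that the truncated value matches the 114040 seen in the log excerpts.

```python
PID_FIELD_WIDTH = 6  # field width described for 3.1C and earlier

def read_pid_fixed_width(pid_text: str, width: int = PID_FIELD_WIDTH) -> int:
    """Return the PID as seen by a reader that keeps only the first `width` digits."""
    return int(pid_text[:width])

# Hypothetical 7-digit PIDs for five agents (not taken from a real system).
agent_pids = [1140401, 1140402, 1140403, 1140404, 1140405]
seen_by_broker = [read_pid_fixed_width(str(pid)) for pid in agent_pids]

print(seen_by_broker)            # every entry is 114040
print(len(set(seen_by_broker)))  # 1 distinct value for 5 distinct agents
```

A PID of 6 digits or fewer passes through such a reader unchanged, which is consistent with the problem appearing only on systems that hand out larger PIDs.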
In the broker log file, you will see:
S-0001>(26-Nov-01 08:24:06:872) Server Process 114040 has terminated.
In the server log file, you will see:
main>(26-Nov-01 08:24:05:061) /app/psc91Bwork/wsbroker1.server.log opened.
S-0001>(26-Nov-01 08:24:06:870) [114040] WTA server initializing. (8835)
S-0002>(26-Nov-01 08:24:07:836) [114040] WTA server initializing. (8835)
S-0003>(26-Nov-01 08:24:08:837) [114040] WTA server initializing. (8835)
S-0004>(26-Nov-01 08:24:09:838) [114040] WTA server initializing. (8835)
S-0005>(26-Nov-01 08:24:10:839) [114040] WTA server initializing. (8835)
Note: The agents' PIDs are all shown as the same value, but they should be different, as a ps listing from the OS confirms.
There are no Java exceptions or error messages.
On AIX 5.1, where the problem was originally found, PID values can exceed 16 bits, which can cause the value to exceed 6 decimal digits.
On a Tru64 cluster, the cluster ID is encoded in the upper digits of the PID, so the PID value is always larger than 6 decimal digits.
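The relationship between bit width and decimal digits can be checked with a little arithmetic. The sketch below is purely illustrative: the helper name is hypothetical, and it makes no claim about how AIX or Tru64 actually allocate PIDs.

```python
def decimal_digits(value: int) -> int:
    """Number of decimal digits needed to print a non-negative integer."""
    return len(str(value))

max_16_bit = 2**16 - 1       # 65535, the largest 16-bit value
largest_6_digit = 999_999    # the largest value a 6-digit field can hold

# Bits needed to represent the first 7-digit value, 1,000,000.
bits_for_7_digits = (largest_6_digit + 1).bit_length()

print(decimal_digits(max_16_bit))   # 5
print(bits_for_7_digits)            # 20
print(decimal_digits(2**20))        # 7
```

A 16-bit PID therefore always fits in a 6-digit field, while a PID that needs 20 or more bits can overflow it.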
FIX:
Upgrade to WebSpeed 3.1C and install the latest patch.