Kbase P156080: WebSpeed agents become orphaned during online database backup
Author: Progress Software Corporation - Progress
Access: Public
Published: 18/11/2009
Status: Unverified
SYMPTOM(s):
WebSpeed agents become orphaned during the online database backup
The database log file also shows error 748 during the online backup:
The server or the system has no more resources. Please contact Progress Technical Support. (748)
The WebSpeed broker log shows the following error messages:
ERROR: cannot start server. (8100)
Timeout while listening for server : java.net.SocketTimeoutException: Accept timed out
Killing the agent with kill -8 creates a core file and a protrace file
The stack trace from the core file reads:
bsdwaitdll
bsdrecvdll
ncsrecvdll
ncsrecv
ncarnet
ncasr_16_14
ncalin()
procon
fdcon
scconx
sccon
drcon
main
The protrace file from _progres repeatedly shows the following:
iomsgw
msgout
msgn
WebSendMessage
WebLogError
ioMsgDisplay
msgw
msgdisplay
umEHDisplayMsgsFunc
drexit : 0x00000110
drSigFatal
---STACK
---ID Node 0 Process <process id> Thread 1
---LCB
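The two log symptoms above (error 748 in the database log and error 8100 in the WebSpeed broker log) can be checked for with a short script. The following is only a minimal sketch: it assumes that each message number appears in parentheses at the end of the log line, as in the messages quoted above, and the function and variable names are hypothetical.

```python
import re
from collections import Counter

# Message numbers quoted in this article (assumption: they appear in
# parentheses at the end of the log line, as in the samples above).
WATCHED = {"748", "8100"}
MSG_NUM = re.compile(r"\((\d+)\)\s*$")


def count_watched(lines):
    """Count lines that end with one of the watched message numbers."""
    hits = Counter()
    for line in lines:
        match = MSG_NUM.search(line)
        if match and match.group(1) in WATCHED:
            hits[match.group(1)] += 1
    return hits


# Sample lines copied from the symptoms in this article.
sample = [
    "The server or the system has no more resources. "
    "Please contact Progress Technical Support. (748)",
    "ERROR: cannot start server. (8100)",
    "Timeout while listening for server : "
    "java.net.SocketTimeoutException: Accept timed out",
]

counts = count_watched(sample)
print(dict(counts))
```

To scan real files, an open file object can be passed in place of the sample list, since the function only iterates over lines. Comparing the timestamps of the matching lines against the backup schedule would then show whether the errors cluster around the online backup.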
FACT(s) (Environment):
Online backup is run three times a day; the morning and afternoon backups fall in the busy hours for WebSpeed
Multiple batch processes run throughout the day, and client/server GUI clients connect to the same database as the WebSpeed agents
WebSpeed 3.1E 64-bit
IBM AIX 5.3
CAUSE:
During the online backup, all database operations are suspended until the bi file has been backed up. As a result, during busy periods the WebSpeed broker launches production agents to serve the messenger requests. Agents started while the bi file is being backed up cannot connect to the database and wait for the database connection to complete.
The stack trace in the protrace file above shows the stack after the kill was executed. At that time, the server was still starting up and was not yet connected to the broker. When the kill was executed, the starting WebSpeed agent tried to write the message to the broker, but the connection to the broker had already been closed (reflected by the "Accept timed out" errors in the broker log), and the agent then exited. In version 9.x, WebSpeed agents send their messages to the broker process, which then writes them to the server log files. From OE 10.x onward, the WebSpeed agent writes directly to the server log file.
Error 748 is caused by a -n value that is too low for the database.
FIX:
Set -PendConnTime to 15 seconds and increase -n for the database.
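As a rough illustration, the two parameters could be placed in the parameter file used to start the database. The fragment below is a sketch only: the -n value of 200 is a placeholder, not a recommendation from this article, and should be sized for the combined number of WebSpeed agents, batch processes and GUI clients that connect at peak time.

```
# Hypothetical database startup parameter file
# Placeholder value: raise -n above the peak number of concurrent users
-n 200
# Pending connection time in seconds, as recommended above
-PendConnTime 15
```

The database must be restarted with the new parameters for them to take effect.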