Kbase P32755: Error 735 Caused by Database Shutting Down without AppServer Shutting Down
Author   Progress Software Corporation - Progress
Access   Public
Published   16/03/2011
Status: Verified

SYMPTOM(s):

Incomplete write when writing to server (735)

State-reset server failed to reconnect to database <name>. (7980)

Database <db> not connected (1006)

The state-reset AppServer errors out with the above error messages.

Restarting the database will not immediately stop these messages from the AppServer.

The database was shut down without the AppServer being shut down.

Cannot trap the error in the AppServer Connect procedure.

FACT(s) (Environment):

The AppServer is connected to the database in client/server mode
Progress 9.1x
OpenEdge 10.x
All Supported Operating Systems

CAUSE:

When an AppServer is in state-reset mode, it will "reset" itself for each client connection.

More precisely, it resets itself when a client disconnects from the AppServer. The reset process returns the AppServer to the state it was in when it was first started. This includes reconnecting to any databases that were connected through the AppServer startup parameters (effectively using the startup parameters to reconnect to the database).

The following is what happens when a database shuts down while the AppServer is still connected to it.

If the AppServer is connected client/server to the database, the AppServer will not realize that it has lost its connection to the database until the next client connects to the AppServer. Once that client connects and tries to run a program on the AppServer, the AppServer will error out with error 735: it has tried to access the database through its socket connection, and the database is no longer responding.

Once the client disconnects from the appserver, the appserver will reset itself. This includes trying to reconnect to the database that is down. If the database is still down, the appserver will fail to connect to it.

At this point, the following message will be written to the .server.log file:

State-reset server failed to reconnect to database <name>. (7980)

The AppServer will then make itself available to handle more client connections.

On the next client connection, regardless of whether the db connection is client/server or self-service, the appserver will fail to run a program, giving the (1006) error.

Again, once the client disconnects, the appserver will reset itself. These (1006) and (7980) errors will continue until the database is restarted, or the appserver is stopped.

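Restarting the database will not immediately stop these messages from the AppServer. By that time the AppServers have already reset themselves and failed to connect to the database, and they will not try again until their next reset, which happens only after another client connects and disconnects.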

FIX:

The suggested way to handle this situation is to use the Connect procedure. The Connect procedure is run by the AppServer on each client connection, before it processes any programs for the client. In this procedure, you can use the 4GL to check whether the AppServer is connected to the database. If it is not, you can use RETURN ERROR to deny the connection to the client process. The client code can then trap this connection failure and handle it gracefully.
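As a minimal sketch of such a Connect procedure, the 4GL below checks the connection with the CONNECTED function and refuses the client with RETURN ERROR. The file name connect.p, the parameter names, and the logical database name "sports" are assumptions made for illustration; substitute the names used in your own configuration.

```abl
/* connect.p: hypothetical AppServer Connect procedure.            */
/* "sports" is an assumed logical database name, not from the      */
/* original article.                                               */

/* A Connect procedure receives three character input parameters. */
DEFINE INPUT PARAMETER pcUserId        AS CHARACTER NO-UNDO.
DEFINE INPUT PARAMETER pcPassword      AS CHARACTER NO-UNDO.
DEFINE INPUT PARAMETER pcAppServerInfo AS CHARACTER NO-UNDO.

/* Deny the client connection if the database is not connected. */
IF NOT CONNECTED("sports") THEN
    RETURN ERROR "Database sports is not available".
```

As described under CAUSE, an AppServer connected client/server may not yet have noticed the lost connection on the first client connection after the shutdown, so a check like this is most likely to help with the later (1006) state rather than the initial (735) error.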

If the database is not connected, try to connect to it from the 4GL (with NO-ERROR), then check again whether it is connected.

This eliminates the one connect/disconnect cycle that each AppServer otherwise needs after the database is brought back up. The disadvantages are that the client connection will take longer to process while the database is down (it has to wait for the database connection attempt to time out), and that the database connection parameters will be hard-coded in the procedure.
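The reconnect attempt could look roughly like the following variant of the Connect procedure. The connection string (database name, logical name, host "dbhost" and port 2500) is a hypothetical example of the hard-coded parameters mentioned above, and the variable and parameter names are likewise assumptions.

```abl
/* connect.p: hypothetical Connect procedure that tries to reconnect. */
DEFINE INPUT PARAMETER pcUserId        AS CHARACTER NO-UNDO.
DEFINE INPUT PARAMETER pcPassword      AS CHARACTER NO-UNDO.
DEFINE INPUT PARAMETER pcAppServerInfo AS CHARACTER NO-UNDO.

/* Assumed, hard-coded connection parameters for illustration only. */
DEFINE VARIABLE cConnectParams AS CHARACTER NO-UNDO
    INITIAL "-db sports -ld sports -H dbhost -S 2500".

IF NOT CONNECTED("sports") THEN DO:
    /* NO-ERROR suppresses the error so the result can be checked below. */
    CONNECT VALUE(cConnectParams) NO-ERROR.

    /* Still not connected: refuse the client so it can handle the failure. */
    IF NOT CONNECTED("sports") THEN
        RETURN ERROR "Database sports is not available".
END.
```

Keeping the parameters in a single variable at least confines the hard-coding to one place; reading them from a parameter file would be another option, not covered here.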


Alternatively:
Have clients connect to the AppServers and then disconnect again, which causes each AppServer to reset itself once more. On that reset, the AppServer will be able to reconnect to the database.

This means that once the database has been restarted, each AppServer must handle a single client connection (which produces the (1006) message) before it can reconnect to the database and resume normal processing.
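A client that does nothing but connect and disconnect, in order to trigger one such reset, might be sketched as follows. The application service name "asbroker1", the host, and the port are assumed values, and which AppServer the broker assigns to a given connection is not controlled by this code, so it may need to be run more than once.

```abl
/* cycle.p: hypothetical client that connects and disconnects once. */
DEFINE VARIABLE hServer    AS HANDLE  NO-UNDO.
DEFINE VARIABLE lConnected AS LOGICAL NO-UNDO.

CREATE SERVER hServer.

/* Assumed connection parameters; adjust to your NameServer and broker. */
lConnected = hServer:CONNECT("-AppService asbroker1 -H localhost -S 5162") NO-ERROR.

IF lConnected THEN
    /* Disconnecting is what causes the state-reset AppServer to reset. */
    hServer:DISCONNECT().
ELSE
    /* Connection refused or failed: report the first error message. */
    MESSAGE "AppServer connection failed:" ERROR-STATUS:GET-MESSAGE(1).

DELETE OBJECT hServer.
```

The same NO-ERROR test on the CONNECT method is also how client code can trap a connection that the Connect procedure has denied with RETURN ERROR.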