Kbase 22009: Unusual Database Failure Can Occur During Heavy Workload
Author: Progress Software Corporation - Progress
Access: Public
Published: 5/1/2002
SUMMARY:
Progress Software has become aware that, in certain limited circumstances, sporadic and unusual database failures may occur when the Progress RDBMS executes heavy workloads with very high user counts on some new symmetric multiprocessor systems that are 8-way or higher. Progress has informed the hardware vendors of the problem and is following up closely with them to identify corrective actions.
EXPLANATION:
Progress suspects that the failures may be related to the modes of operation of the memory caches on these high-end systems, which use newer types of processors and advanced memory-subsystem designs. The affected systems are those with IBM POWER4 processors running AIX 5.1, and those with Intel Xeon, Pentium 4, or P6 processors running UnixWare or OpenUnix.
The failures present a wide variety of symptoms consistent with a hardware malfunction, such as a memory or disk controller problem. Symptoms, among others, may include the following:
error message 219 - could not locate find descriptor
error message 49 - memory violation
error message 355 - database block has incorrect recid
error message 3629 - Block xxxx use count underflow
other error messages regarding invalid buffer use counts
processes waiting forever for database buffers
Due to the broad range of symptoms, the exact cause of the failures is extremely difficult to diagnose. If you are experiencing these symptoms, examine the operating system log files for hardware-related messages and take corrective action as necessary. Contact Progress Technical Support for further software analysis and assistance.
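As a first triage step, the numbered messages listed above can be counted in a database log. The following is a minimal sketch, not a Progress-supplied tool. It assumes that each log line ends with the message number in parentheses, for example "(49)". The function names count_symptoms and scan_log are hypothetical, and the script cannot detect the unnumbered symptoms such as processes waiting forever for buffers.

```python
import re
import sys
from collections import Counter

# Message numbers named in this article, with the descriptions given there.
SYMPTOM_MESSAGES = {
    219: "could not locate find descriptor",
    49: "memory violation",
    355: "database block has incorrect recid",
    3629: "block use count underflow",
}

# Assumption: a log line ends with its message number in parentheses, e.g. "... (49)".
_MESSAGE_NUMBER = re.compile(r"\((\d+)\)\s*$")


def count_symptoms(lines):
    """Count how often each symptom message number appears in the given lines."""
    counts = Counter()
    for line in lines:
        match = _MESSAGE_NUMBER.search(line)
        if match:
            number = int(match.group(1))
            if number in SYMPTOM_MESSAGES:
                counts[number] += 1
    return counts


def scan_log(path):
    """Read a log file and return the symptom counts (hypothetical helper)."""
    with open(path, errors="replace") as handle:
        return count_symptoms(handle)


if __name__ == "__main__" and len(sys.argv) > 1:
    for number, hits in sorted(scan_log(sys.argv[1]).items()):
        print(f"({number}) {SYMPTOM_MESSAGES[number]}: {hits} occurrence(s)")
```

Run it with the path to the log file as the only argument. A nonzero count is only a hint that the symptoms match. It does not confirm that this problem is the cause.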
-- IBM AIX 5.1
For IBM AIX 5.1 for the POWER4-based pSeries 690 systems (known as the "Regatta"), a patch (9.1C15) is available.
-- Intel Pentium 4, Xeon, and P6 based systems
For the Intel Pentium 4, Xeon, and P6 based systems, Progress is investigating the matter with the hardware and operating system vendors. In the interim, a workaround is to specify the undocumented startup parameter -mux 0 when starting the database broker/server in multi-user mode. This parameter switches one of the locking algorithms that Progress uses to regulate access to the lock table and the database buffer cache to an alternate algorithm, which may provide slightly lower performance.
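The workaround amounts to adding -mux 0 to the broker startup command. The sketch below only builds and prints such a command and does not start anything. It assumes the broker is started with the standard proserve script. The database name "sports" and the helper names broker_command and as_shell are hypothetical.

```python
import shlex


def broker_command(db_name, extra_args=()):
    """Build an argument list that starts a broker with the -mux 0 workaround."""
    if not db_name:
        raise ValueError("db_name must not be empty")
    # "-mux 0" is the undocumented parameter described in this article.
    return ["proserve", db_name, "-mux", "0", *extra_args]


def as_shell(argv):
    """Render the argument list as a single, safely quoted shell command line."""
    return " ".join(shlex.quote(part) for part in argv)


if __name__ == "__main__":
    # "sports" is a placeholder database name for illustration only.
    print(as_shell(broker_command("sports")))
```

Keeping the parameter in one place, such as a startup script or parameter file, should make it easier to remove once a permanent fix is available.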
In our ongoing commitment to providing quality service and support to our customers and partners, Progress will provide additional information as it becomes available.