Consultor Eletrônico

Status: Verified

SYMPTOM(s):

Ongoing index corruption causing database crashes in Progress 9

<func-name>: Error occurred in area <num>, block number: <num>, extent: <name>. (10560)

bmLockBuffer: Error occurred in area <num>, block number: <num>, extent: <name>. (10560)

SYSTEM ERROR: Database block <nbr> has incorrect recid: <nbr>. (355)

Corrupt block detected when attempting to release a buffer. (4232)

bmReleaseBuffer: Error occurred in area <num>, block number: <num>, extent: <name> (10560)

Writing block <num> to log file. Please save and send the log file to Progress Software Corp. for investigation. (10561)

SYSTEM ERROR: Memory violation. (49)

Machine is SMP multi-processor, Pentium Xeon.

2 GB or more memory available on system

Server architecture entirely addresses >= 2GB

Idxcheck operations detect index corruption that other index utilities do not necessarily detect

Index corruption accelerates over time, culminating in a database crash

Compaq ML530 and Compaq Proliant ML750 machines with write back cache

-spin and -B have been sufficiently tuned

FACT(s) (Environment):

Progress 9.1D
Progress 9.1C
SCO UnixWare 7.1.1
SCO UnixWare 7.1.3

CAUSE:

BUG# 20031029-019

CAUSE:

This is an OS issue (not a Progress issue) with the memmove() call. When an overlapping memmove() call operates on a range that crosses the 2 GB boundary (the point at which an address read as a signed 32-bit value changes from positive to negative), it causes "byte shift" corruption at the block level. Later, when we attempt to add additional entries to the index, our calculations on the corrupted block produce garbage (negative lengths). Passing a negative length to memmove(), which interprets the length as an unsigned value and therefore as a very large one, then causes other blocks in our buffer pool to be overwritten with garbage. The real problem, however, is that the earlier memmove() corrupted the block.
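The unsigned interpretation of a negative length described above can be illustrated with a minimal C sketch. The helper name length_seen_by_memmove is hypothetical and is not part of Progress or of the OS; it only shows the integer conversion that happens when a signed length is passed as the size_t argument of memmove().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): returns the byte count that
 * memmove() would actually see if a signed length were passed as its
 * size_t argument. */
size_t length_seen_by_memmove(int signed_len)
{
    /* Converting a negative int to size_t wraps modulo SIZE_MAX + 1,
     * so a small negative length becomes an enormous positive one. */
    return (size_t)signed_len;
}
```

With this helper, a length of -8 is seen as SIZE_MAX - 7, which on a system with a 32-bit size_t is just under 4 GB. A copy of that size runs far past the intended block, which is consistent with neighbouring buffer pool blocks being overwritten.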

Because allocated memory does not need to be contiguous, even a small request (say, only 8 KB) could, in an extreme case, "strike the unlucky jackpot" and be given memory located at the 2 GB address. This is why the issue only affects a server architecture that addresses 2 GB or more: otherwise the boundary cannot be crossed in the first place.

FIX:

Apply Service Pack 9.1D08 or later, which works around this memmove() OS issue by not using a block in the buffer pool if its address range spans the 2 GB limit.
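The kind of check this fix implies can be sketched in C as follows. This is an assumption-laden illustration, not the code shipped in the Service Pack: the function name spans_2gb_boundary, the TWO_GB constant, and the use of a 64-bit integer to hold the address are all hypothetical choices made to keep the arithmetic free of overflow.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* 2 GB = 2^31, the first address that is negative when read as a
 * signed 32-bit value. */
#define TWO_GB ((uint64_t)1 << 31)

/* Hypothetical check (illustration only): true if the byte range
 * [start, start + len) begins below the 2 GB boundary and ends at or
 * above it, i.e. the block straddles the boundary. */
bool spans_2gb_boundary(uint64_t start, size_t len)
{
    if (len == 0) {
        return false;               /* an empty range spans nothing */
    }
    uint64_t last = start + (uint64_t)len - 1;  /* last byte in range */
    return start < TWO_GB && last >= TWO_GB;
}
```

Under these assumptions, an 8 KB block starting at address 0x7FFFF000 straddles the boundary and would be skipped, while an 8 KB block starting at 0x7FFFE000 (ending at 0x7FFFFFFF) or at 0x80000000 would not.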