Consultor Eletrônico

Status: Verified

GOAL:

Does Progress use Synchronous or Asynchronous I/O?

GOAL:

Does Progress use buffered or unbuffered I/O?

GOAL:

Does Progress use aio_read() or aio_write() system calls?

FACT(s) (Environment):

All Supported Operating Systems
Progress/OpenEdge Product Family

FIX:

Asynchronous I/O means either issuing several I/Os and then waiting for them to complete, or issuing an I/O without waiting, performing some other processing, and later asking about or waiting for completion of the previously issued I/O. There are even ways for one thread to wait on another thread's I/O completion. The actual way asynchronous I/O is performed is platform dependent.
Synchronous I/O means waiting for each I/O to complete before continuing.
Progress does not do any asynchronous I/O per se. Progress does not make use of the aio_read() or aio_write() system calls. However, you could consider Progress as implementing its own asynchronous I/O "processing" through its file (ai, bi and db) buffering and the I/O done by page writers. The process that changes a block in a buffer does not actually perform the I/O. It continues its processing while the modified block waits in a buffer pool (-aibufs, -bibufs or -B) to be written out by another process (the aiw, biw or apw).
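The division of labor described above can be illustrated with a small C sketch. This is not Progress source code: the pool size, block size, structure layout and the function names open_pool_file(), modify_block() and page_writer_pass() are all hypothetical, and the page writer is shown as a plain function rather than a separate process. It only shows that the modifier touches memory and returns, while every write issued by the writer is still an ordinary synchronous call.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define POOL_SIZE  4    /* hypothetical pool size, stands in for -B */
#define BLOCK_SIZE 16   /* tiny block size, for illustration only */

struct buffer {
    long block_no;          /* which block of the file this buffer holds */
    int  dirty;             /* 1 if changed in memory but not yet written */
    char data[BLOCK_SIZE];
};

static struct buffer pool[POOL_SIZE];

/* Open (or create) the file the page writer will write to. */
int open_pool_file(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Called by the process making a change: update memory, mark the
   buffer dirty, and return. No I/O is performed here. */
void modify_block(int slot, long block_no, const char *bytes)
{
    assert(slot >= 0 && slot < POOL_SIZE);
    pool[slot].block_no = block_no;
    memset(pool[slot].data, 0, BLOCK_SIZE);
    strncpy(pool[slot].data, bytes, BLOCK_SIZE - 1);
    pool[slot].dirty = 1;
}

/* Called by the page writer: write out every dirty buffer with a
   synchronous pwrite(). Returns the number of buffers written,
   or -1 if a write fails. */
int page_writer_pass(int fd)
{
    int written = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].dirty)
            continue;
        off_t off = (off_t)pool[i].block_no * BLOCK_SIZE;
        if (pwrite(fd, pool[i].data, BLOCK_SIZE, off) != BLOCK_SIZE)
            return -1;
        pool[i].dirty = 0;
        written++;
    }
    return written;
}
```

In this sketch the caller of modify_block() never blocks on the disk. Only page_writer_pass() does, which is the sense in which the buffering resembles asynchronous processing without using any asynchronous system calls.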
Progress uses two flavors of synchronous I/O: buffered synchronous I/O and unbuffered synchronous I/O (direct I/O). Some consider buffered I/O to be asynchronous, but it is not when asynchronous I/O is defined as allowing the same thread to execute other operations before an issued I/O "completes".

To guarantee crash recovery, we follow a write-ahead logging rule, which basically states that we write to the bi (before-image) file first and then to the data files. To follow this protocol, it is not enough to issue the writes in the proper order. We must also ensure that the physical changes to the disk happen in the proper order.
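A minimal C sketch of that ordering rule, under stated assumptions: the function names open_log_sync(), open_data_buffered() and wal_apply() are hypothetical, the "note" is an opaque byte string rather than a real bi note, and O_CREAT is added only so the example can run on its own. It is not Progress code. It shows the log write being forced to disk through O_DSYNC before the data file is touched.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Open the log so that write() returns only once the data is on disk. */
int open_log_sync(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_DSYNC, 0600);
}

/* Open a data file with ordinary buffered I/O. */
int open_data_buffered(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Apply one change under the write-ahead rule:
   1. append the log note and let it reach disk (O_DSYNC on log_fd),
   2. only then write the data block.
   Returns 0 on success, -1 if either step fails. */
int wal_apply(int log_fd, int data_fd,
              const void *note, size_t note_len,
              const void *block, size_t block_len, off_t block_off)
{
    assert(note_len > 0 && block_len > 0);

    if (lseek(log_fd, 0, SEEK_END) == (off_t)-1)
        return -1;
    if (write(log_fd, note, note_len) != (ssize_t)note_len)
        return -1;      /* log not durable: leave the data file alone */

    if (pwrite(data_fd, block, block_len, block_off) != (ssize_t)block_len)
        return -1;
    return 0;
}
```

If the log write fails, wal_apply() returns before touching the data file, so the data file never contains a change that the log does not describe.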
Therefore, our current implementation of I/O is as follows:
We do unbuffered synchronous I/O (O_RDWR|O_DSYNC) to the bi file, which guarantees crash recovery
(unless running with the -i (no-integrity) or -R (non-raw) startup flags).

We do buffered synchronous I/O (O_RDWR) to the bi file if running with the -R startup flag. This allows rollback and possibly crash recovery. If the database were to crash, crash recovery should work as all the data we need should still be in the file system's cache if it hasn't made it to physical disk yet. However, if the system were to crash, it is NOT guaranteed that crash recovery will work - it most likely will not.
We do buffered synchronous I/O (O_RDWR) to the bi file and NO bi/data order synchronization if running with the -i startup flag (-i forces -R). This means that rollback should work, but if the database were to crash (or a user gets killed), even if the system did not crash, there is no guarantee that crash recovery will complete. In this case, I think we even prevent you from connecting to the database - we don't even try to recover.
We do unbuffered synchronous I/O (O_RDWR|O_DSYNC) to the data files if running with the -directio startup flag. This means direct writes to disk for data files and allows us to avoid needing fdatasync() calls at checkpoints. Some customers see improvements running this way, some don't. It all depends on the application. The idea is that since Progress maintains a buffer pool, why do we need to use the file system's "buffer pool" as well? This means we have a cache on top of a cache. There are cases where double caching is better but I'll defer that discussion.
We do buffered synchronous I/O (O_RDWR) to the data files if not running with -directio, which means that we must issue an fdatasync() call when we want to guarantee that the data I/O has made it out of the file system's cache and has actually been written to disk (usually only at checkpoint time).
The difference between buffered and unbuffered synchronous I/O is that an unbuffered synchronous I/O is guaranteed to be on disk when returned from the write() system call. A buffered synchronous I/O is guaranteed to exist, either in the file system's cache OR possibly on disk when returned from the write() system call.
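The two data-file modes can be contrasted in a short C sketch. The enum io_mode and the functions data_open_flags(), open_db_file() and checkpoint() are hypothetical names invented for this illustration, with the enum standing in for the presence or absence of -directio. The open flags are the ones listed above, plus O_CREAT so the example can run on its own. Treat it as an approximation of the idea, not as the actual implementation.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical switch standing in for the -directio startup flag. */
enum io_mode { IO_BUFFERED, IO_UNBUFFERED };

/* Translate the mode into the open() flags listed in the article. */
int data_open_flags(enum io_mode mode)
{
    return (mode == IO_UNBUFFERED) ? (O_RDWR | O_DSYNC) : O_RDWR;
}

/* Open (or create) a data file in the requested mode. */
int open_db_file(const char *path, enum io_mode mode)
{
    return open(path, data_open_flags(mode) | O_CREAT, 0600);
}

/* At a checkpoint:
   - buffered: the data may still sit in the file system cache,
     so fdatasync() is needed to force it to disk;
   - unbuffered: each write() already returned with the data on
     disk, so there is nothing left to flush.
   Returns 0 on success, -1 on error. */
int checkpoint(int fd, enum io_mode mode)
{
    assert(fd >= 0);
    if (mode == IO_BUFFERED)
        return fdatasync(fd);
    return 0;
}
```

In both modes the write() call itself is synchronous. What differs is where the data is guaranteed to be when the call returns, and therefore whether the checkpoint has extra work to do.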