Kbase 20688: How Does -directio Improve Performance?
Author: Progress Software Corporation - Progress
Access: Public
Published: 08/06/2010
Status: Verified
GOAL:
How Does -directio Improve Performance?
GOAL:
What is -directio?
GOAL:
How is -directio implemented?
GOAL:
How to use the -directio Database Startup Parameter
GOAL:
Direct io explained
FACT(s) (Environment):
All Supported Operating Systems
Progress 8.x
Progress 9.x
FIX:
The Progress RDBMS for UNIX has had two different implementations of the -directio parameter. The original implementation was used in Version 6.3 and Version 7.X and is now only of historical interest. Version 8 and later use a new implementation, which is described here.
Using the -directio option may improve your database performance through more effective regulation of the disk write workload of a UNIX system hosting a Progress database server.
Background: Database I/O
In Version 8 through 9.1B, the Progress database storage manager by default performs random-access database reads and writes on UNIX systems with buffered I/O, using the lseek()/read() and lseek()/write() system calls. In Version 9.1C and later, the default is the pread64() and pwrite64() system calls. In both cases, a sync() system call at the end of each checkpoint ensures that the data is eventually forced to disk. These system calls use the operating system's file buffer cache when possible.
The overall disk write workload from database writes will sometimes become quite uneven or "bursty", especially when the database update workload is heavy. Database blocks written by the asynchronous page writers are not actually written to disk when the page writer issues a write() or pwrite64() system call. Instead, the operating system copies the database blocks into the filesystem cache in memory. They are usually written to disk some time later, when the filesystem decides to do so. The filesystem may delay actually writing the database blocks to disk until it needs to make room for reading in a new disk block or until after the next sync() call. So all the careful work the page writers do to plan their activities to smooth out database writes is wasted.
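The default buffered path described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the storage manager's code: the 8192-byte block size, the file name demo.db, and the helper names are hypothetical, and Python's os.pwrite() stands in for pwrite64().

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumed database block size, for illustration only


def write_block_seek(fd, block_no, data):
    """Older style: position with lseek(), then write()."""
    os.lseek(fd, block_no * BLOCK_SIZE, os.SEEK_SET)
    return os.write(fd, data)


def write_block_positional(fd, block_no, data):
    """Newer style: one positional write, no separate seek."""
    return os.pwrite(fd, data, block_no * BLOCK_SIZE)


def checkpoint():
    """End of checkpoint: ask the OS to flush all modified file buffers."""
    os.sync()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")  # hypothetical file name
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)  # buffered: no O_SYNC
    try:
        # Both writes return once the data is in the filesystem cache.
        write_block_seek(fd, 0, b"A" * BLOCK_SIZE)
        write_block_positional(fd, 1, b"B" * BLOCK_SIZE)
        checkpoint()  # only now is the data forced toward the disk
        first = os.pread(fd, BLOCK_SIZE, 0)
        second = os.pread(fd, BLOCK_SIZE, BLOCK_SIZE)
    finally:
        os.close(fd)
```

Note that checkpoint() flushes every modified file page on the system, not just the two blocks written here, which is the cost the rest of this article is concerned with.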
Database I/O With -directio - Theory
When the -directio server startup parameter is specified, the database storage manager uses a different method for writing database blocks. This method is called "synchronous write" and is activated by specifying the O_SYNC (or O_DSYNC if available) option when the database files are opened with the open() system call. Reading and writing the database is performed with the pread64() (or read()) and pwrite64() (or write()) system calls.
In this mode, all database I/O operations will still use the filesystem buffers, but writes are handled in a different manner than without the -directio option in effect. A pwrite64() (or write()) system call does not complete until after the data have been transferred to the disk by the operating system's device drivers. As a result, writes take longer and the page writers take longer to do their job, but the overall disk write workload should be more evenly distributed and have fewer spikes.
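A minimal sketch of the synchronous-write mode just described, assuming a POSIX system. The block size, file name, and helper names are hypothetical, and this is not the storage manager's actual code; it only shows the effect of opening a file with a synchronous-write flag.

```python
import os
import tempfile

BLOCK_SIZE = 8192  # assumed block size, for illustration only

# Prefer O_DSYNC where the platform defines it, otherwise fall back to O_SYNC.
SYNC_FLAG = getattr(os, "O_DSYNC", os.O_SYNC)


def open_synchronous(path):
    """Open a file so each write returns only after the OS reports it on disk."""
    return os.open(path, os.O_RDWR | os.O_CREAT | SYNC_FLAG, 0o600)


def write_block(fd, block_no, data):
    """Positional write; with the sync flag set, it waits for the device."""
    return os.pwrite(fd, data, block_no * BLOCK_SIZE)


with tempfile.TemporaryDirectory() as tmp:
    fd = open_synchronous(os.path.join(tmp, "demo.db"))  # hypothetical name
    try:
        written = write_block(fd, 3, b"C" * BLOCK_SIZE)
        read_back = os.pread(fd, BLOCK_SIZE, 3 * BLOCK_SIZE)
    finally:
        os.close(fd)
    # No sync() call follows: each write was already forced to disk.
```

The only change from a buffered open is the extra flag passed to open(); the read and write calls themselves are unchanged.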
When using synchronous writes, the storage manager does not need to use the sync() system calls at the end of each checkpoint. In modern UNIX systems, this can be quite important. sync() calls can be quite expensive because modern systems often have large amounts of memory and multiple disk drives, and they often have very large filesystem buffer caches. When the storage manager makes a sync() call, the operating system writes all modified file pages to disk. This includes all database pages present in the filesystem cache as well as other pages. Flushing the filesystem cache can be quite time consuming and can cause noticeable delays in system and database activity.
By using the -directio option, one gains the following beneficial effects:
* Expensive sync() calls are eliminated, along with unnecessary I/O caused by flushing data that has nothing to do with the database.
* Overall disk write scheduling is more even because writes occur when the page writers want them to, and they try to organize their activities to provide as even a write rate as they can.
To get these benefits, you will need more page writers than when you do not use the -directio option. A "rule of thumb" is to use one page writer per disk that contains database data files, and one extra. You may need more or fewer, depending on your system, number of users, and application. A system with a light update workload (one in which the application does not update the database very much) will need fewer page writers because fewer database writes need to be done.
If you use -directio without increasing the number of page writers, or you do not use any page writers, you will probably see a decrease in overall performance.
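The rule of thumb above amounts to simple arithmetic. The helper below is a hypothetical illustration of it, giving only a starting point to be adjusted for your system and workload.

```python
def suggested_page_writers(data_disks):
    """Rule-of-thumb starting point: one page writer per data disk, plus one."""
    if data_disks < 1:
        raise ValueError("expected at least one disk holding database data files")
    return data_disks + 1


for disks in (1, 4, 8):
    print(disks, "data disks ->", suggested_page_writers(disks), "page writers")
```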
Database I/O With -directio - Practice
The previous section describes how -directio works in theory. In theory, there is no difference between theory and practice, but in practice, there is.
Along with the advantages, there are a few disadvantages as well. When -directio is not being used, the filesystem schedules write operations at times of its choosing and also tries to coalesce writes to adjacent filesystem pages when possible. This coalescing can sometimes greatly reduce the number of disk seeks and disk writes. When -directio is used, the filesystem's write coalescing is largely eliminated for database writes and this may result in lower performance.
The -directio option gives different results with different versions of the UNIX operating system and with different filesystems, and occasionally with different releases of the same operating system. The -directio option is not suitable in all cases. In some cases, there is no benefit, and in others you may observe severe performance degradation.
In particular, note the following:
On AIX systems (release 4.3 and later), -directio has been beneficial in many situations and has not been known to cause problems.
On Linux RedHat systems up to RedHat 8, -directio often provides no benefit.
At this time (July 2003), the use of -directio on HP-UX is NOT recommended. On HP-UX systems release 11.0 and later, customers have experienced a variety of problems, caused by defects in the implementation of the pread64() and pwrite64() system calls. There are several patches available from HP to correct these problems. The effect of these defects is a severe degradation in write performance, even when -directio is not being used.
While the -directio option can be very beneficial, it is not always so. In all cases, you should adopt the -directio option only after performing tests to determine whether it is helpful in your environment.
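As a first, rough check of how a given filesystem handles synchronous writes, a micro-benchmark along the following lines may help. It is a sketch under stated assumptions, not a substitute for testing with the real database and workload: the block size, block count, and file names are hypothetical, and os.fsync() on the one test file is used as a stand-in for the checkpoint sync() so that the test does not flush the whole system.

```python
import os
import tempfile
import time

BLOCK_SIZE = 8192  # assumed block size, for illustration only
BLOCK_COUNT = 64   # deliberately small; raise it for a more meaningful run


def time_writes(path, extra_flags, final_flush):
    """Write BLOCK_COUNT blocks and return the elapsed time in seconds."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | extra_flags, 0o600)
    payload = b"\0" * BLOCK_SIZE
    start = time.perf_counter()
    try:
        for block_no in range(BLOCK_COUNT):
            os.pwrite(fd, payload, block_no * BLOCK_SIZE)
        if final_flush:
            os.fsync(fd)  # stand-in for the end-of-checkpoint flush
    finally:
        os.close(fd)
    return time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    buffered = time_writes(os.path.join(tmp, "buffered.db"), 0, True)
    synchronous = time_writes(os.path.join(tmp, "sync.db"), os.O_SYNC, False)

print(f"buffered writes + final flush: {buffered:.4f}s")
print(f"synchronous writes:            {synchronous:.4f}s")
```

For the numbers to mean anything, the test files should be placed on the same filesystem that holds the database rather than in the default temporary directory, which on some systems is memory-backed.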