Consultor Eletrônico

Why Two-Phase Commit Affects Database Performance

Two-phase commit is used to ensure database integrity when a single
transaction updates two or more databases. This much we know. We have
traditionally seen performance degradation of anywhere from 20% to
50%. These numbers are not etched in stone, hence our disclaimer
"your mileage may vary". It would be easy to pin the performance hit
on the mere concept of writing to two or more files simultaneously
within the same transaction. However, the internal mechanics of how
this process works are both interesting and enlightening.

Believe it or not, the majority of the performance impact of two-phase
commit occurs within the bi (before-image) file. One of the most
expensive elements of the bi's default behavior is how it moves a
transaction's commit note (the end note) from the buffer in which it
first appears to the file where it will ultimately reside. This is
also a problem by default in a normal database without two-phase
commit enabled. Before we explain the workaround, let's get a little
deeper into the problem.

Typically the buffer that receives the end note of a transaction needs
to be written to the bi file as soon as the end note is received. This
means that we incur one i/o operation every time an end note is
written to a bi buffer. In an update-intensive environment, you can
imagine how many of these end notes we will encounter during a typical
day of processing, and executing an i/o operation for each one of them
degrades performance. Since increasing the scope of the transactions
was anything but a desirable workaround, we introduced the -Mf startup
parameter.

The -Mf startup parameter relaxes the need to write out the end note
buffer as soon as the end note is written. Writing the buffer out
immediately, whether it is full or not, is not only one extra write
operation per transaction but also an inefficient one. -Mf lets us
define a number of seconds for which the transaction end note may sit
in the bi buffer while more notes are written to the same buffer. We
are no longer obligated to write that buffer at the instant the end
note is written, so we perform fewer, more efficient i/o operations.
This is a good thing. This parameter defaults to 0 and has a maximum
value that is much higher than you'll ever need (32,768).
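
To make the effect of the delay concrete, here is a minimal sketch of
the idea in Python. It is a simplified model, not the actual database
engine logic: the function name count_bi_writes and the timestamps are
hypothetical, and the model assumes the buffer never fills up during
the delay window (a full buffer would be written out regardless).

```python
def count_bi_writes(commit_times, delay):
    """Count buffer writes caused by transaction end notes.

    commit_times: times at which end notes arrive (any consistent unit).
    delay: how long an end note may sit in the buffer before the buffer
           must be written (0 means write as soon as the note arrives).

    Simplified model (an assumption, not the real engine): the first end
    note to land in an unwritten buffer starts a timer, and every end
    note that arrives before the timer expires shares that one write.
    """
    writes = 0
    flush_deadline = None  # time at which the pending buffer is written
    for t in sorted(commit_times):
        if flush_deadline is None or t > flush_deadline:
            # No pending buffer, or it was already written: this end
            # note starts a new buffer and therefore costs one write.
            writes += 1
            flush_deadline = t + delay
    return writes


# One commit per second for a minute (hypothetical workload).
commits = range(0, 60)
print(count_bi_writes(commits, 0))  # every end note forces its own write
print(count_bi_writes(commits, 3))  # end notes share writes within 3 seconds
```

In this toy workload the delay of 3 cuts 60 writes down to 15, because
each write now carries four end notes instead of one.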

NOTE: The concept of writing the buffer out as soon as the end note is
received was actually a noble idea. It minimized the chance of losing
a transaction if the system crashed while the transaction end note was
in the buffer and not on disk. However, the added i/o has caused this
to become a potential bottleneck on some systems, hence the -Mf
parameter.

Your tool for gauging whether -Mf is doing its job is partial writes
to the bi file. If partial writes are 50% or greater, then you
probably need to increase the value of -Mf.
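
The 50% rule of thumb can be sketched as a small check. This is only
an illustration: the function names are hypothetical, and it assumes
the percentage is taken as partial writes divided by total bi writes,
which the text above does not spell out. Take both counts from
whatever monitoring figures you have available.

```python
def partial_write_pct(partial_writes, total_bi_writes):
    """Return partial writes as a percentage of all bi writes.

    Assumption: the denominator is the total number of bi writes over
    the same sampling interval.
    """
    if total_bi_writes <= 0:
        return 0.0  # nothing written yet, so nothing to judge
    return 100.0 * partial_writes / total_bi_writes


def should_increase_delay(partial_writes, total_bi_writes, threshold_pct=50.0):
    """Apply the rule of thumb: 50% or more partial writes means the
    delay parameter is probably set too low."""
    return partial_write_pct(partial_writes, total_bi_writes) >= threshold_pct


# Hypothetical sample: 600 of 1000 bi writes were partial.
print(partial_write_pct(600, 1000))
print(should_increase_delay(600, 1000))
```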

Now you may be asking yourself whether all of this explanation of -Mf
was necessary for two-phase commit, and the answer is absolutely. When
two-phase commit was first written, it eliminated the ability to use
-Mf with distributed transactions (those that update more than one db
at one time). This meant that not only were we performing the write
every time we got an end note, but now it was to multiple files
instead of one. As you can imagine, the potential performance impact
could be severe, and in many cases will be, especially if i/o is the
bottleneck in the first place. If cpu is the bottleneck, then this
shouldn't make it any worse, but that problem should be addressed as
well.

Now that we have more than one inefficient write operation for every
transaction end note, what did we do about it? The answer is the
-groupdelay startup option. Conceptually it is very similar to -Mf,
except that it is meant only for databases running two-phase commit.
Like -Mf, this parameter allows the end note to remain in the bi
buffer for a set period before the buffer is written out, thus
increasing the efficiency of the write. Again, since we are writing
to multiple databases, the gain here may not be directly proportional
to that of -Mf with one db.

The difference between -groupdelay and -Mf is that the former is set
in milliseconds. The default is 10 and we don't recommend setting it
any higher than 500. Again, as with -Mf, your gauge is partial writes
to the bi file. If they are 50% or more, then you need to increase the
value. If there is already contention in the bi buffer pool, then you
may see buffer waits increase as well when not using either of these
bi efficiency parameters. As always when running with two-phase
commit, your mileage may vary.
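
Because -Mf is set in seconds and -groupdelay in milliseconds, it is
easy to mix up the units. The following sketch checks a pair of
proposed values against the limits quoted in this article. The
function and constant names are hypothetical, and the limits are
simply those stated above, not values read from the database.

```python
MF_MAX_SECONDS = 32768               # maximum -Mf value quoted above
GROUPDELAY_DEFAULT_MS = 10           # default -groupdelay quoted above
GROUPDELAY_RECOMMENDED_MAX_MS = 500  # recommended ceiling quoted above


def check_delay_settings(mf_seconds=0, groupdelay_ms=GROUPDELAY_DEFAULT_MS):
    """Return a list of warnings for proposed -Mf / -groupdelay values."""
    warnings = []
    if not 0 <= mf_seconds <= MF_MAX_SECONDS:
        warnings.append(
            "-Mf %s is outside 0..%d seconds" % (mf_seconds, MF_MAX_SECONDS))
    if groupdelay_ms < 0:
        warnings.append("-groupdelay cannot be negative")
    elif groupdelay_ms > GROUPDELAY_RECOMMENDED_MAX_MS:
        warnings.append(
            "-groupdelay %s ms exceeds the recommended maximum of %d ms"
            % (groupdelay_ms, GROUPDELAY_RECOMMENDED_MAX_MS))
    return warnings


# Someone who meant "3 seconds" and typed 3000 milliseconds gets a warning.
for message in check_delay_settings(mf_seconds=3, groupdelay_ms=3000):
    print(message)
```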

SDA 6/8/98