RAID descriptions, definitions, and recommendations

INTRODUCTION:
=============

RAID is an acronym applied to disk technology. It originally stood
for Redundant Arrays of Inexpensive Disks, but its current usage is
accepted as Redundant Arrays of Independent Disks.

RAID is typically implemented to provide:

- Data reliability by replicating data so that it is not destroyed or
made inaccessible if the disk on which it is stored fails.

- Improved I/O performance by balancing the I/O load across disks.

- Simplified storage management by treating several physical disks as
one virtual unit.

DEFINITIONS:
============

Here are some definitions to help understand the RAID concept.

Array: A collection of disks combined with some array management
software which presents these disks as one virtual disk.

Mirrored Array: Two or more member disks containing identical images
of data. Its properties typically include high reliability and
availability. Its aggregate capacity is equal to that of its smallest
member. Its read performance is usually measurably better than that
of a single member, and its write performance slightly lower.

Striped Array: Two or more member disks across which data is
interleaved. Its I/O performance properties typically include high
read and write performance, but the array is less reliable than its
least reliable member.
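
The reliability statements in these definitions can be checked with a
little arithmetic. The sketch below assumes that disks fail
independently of one another and uses made-up survival probabilities;
the function names are illustrative only.

```python
from math import prod


def striped_survival(disk_survival):
    """A striped array keeps its data only if every member survives."""
    return prod(disk_survival)


def mirrored_survival(disk_survival):
    """A mirrored array loses its data only if every member fails."""
    return 1.0 - prod(1.0 - p for p in disk_survival)


# Hypothetical probability that each member survives some fixed period.
members = [0.99, 0.98, 0.97]

print(f"least reliable member: {min(members):.6f}")
print(f"striped array:         {striped_survival(members):.6f}")
print(f"mirrored array:        {mirrored_survival(members):.6f}")
```

Under these assumptions the striped figure always falls below that of
the least reliable member, and the mirrored figure rises above that of
the most reliable one.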

EXAMPLES:
=========

    Mirrored Disk Array              Striped Disk Array
          _______                         _______
         (_VDSK1_)                       (_VDSK2_)
          /     \                         \ \ \     ________
         /       \                         \ \ ----(_SDSK2a_)
   ________     ________                    \ \     ________
  (_MDSK1a_)   (_MDSK1b_)                    \ ----(_SDSK2b_)
                                              \     ________
                                               ----(_SDSK2c_)

In the mirror example above (sometimes called a 'shadow' set), when a
write operation is performed to VDSK1, it is actually made to both
MDSK1a and MDSK1b. Because both disks contain the same data, a read
operation can be satisfied from either disk. If either MDSK1a or
MDSK1b fails, the other disk still contains accessible data.
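
The mirror behaviour just described can be modelled in a few lines.
This is only a toy sketch: the class, its methods, and the in-memory
dictionaries standing in for disks are all hypothetical and do not
correspond to any real volume manager.

```python
class MirroredArray:
    """Toy model of a mirror set: every member holds the same blocks."""

    def __init__(self, member_names):
        # One dict per member disk, mapping block number -> data.
        self.members = {name: {} for name in member_names}
        self.failed = set()

    def write(self, block, data):
        # A write to the virtual disk is made to every surviving member.
        for name, disk in self.members.items():
            if name not in self.failed:
                disk[block] = data

    def read(self, block):
        # A read can be satisfied by any surviving member.
        for name, disk in self.members.items():
            if name not in self.failed:
                return disk[block]
        raise OSError("all members of the mirror have failed")

    def fail(self, name):
        self.failed.add(name)


vdsk1 = MirroredArray(["MDSK1a", "MDSK1b"])
vdsk1.write(0, b"customer record")
vdsk1.fail("MDSK1a")
print(vdsk1.read(0))  # served by MDSK1b
```

A real implementation would also spread reads across the members,
which is where the read performance gain of a mirror comes from.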

In the striped example above, when a write operation is performed to
VDSK2, the first piece of the data is written to SDSK2a, the next
piece to SDSK2b, the next to SDSK2c, and if there is more data, the
following piece goes back to SDSK2a, and so on. When the stripe set is created, the
size of these pieces is specified. Because consecutive pieces reside
on different disks, a transfer that spans several pieces can be
serviced by several disks in parallel; a transfer that fits within a
single piece involves only one disk and sees no gain. If any disk in
the stripe set fails, the data will not be accessible.
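
The round-robin placement of pieces can be sketched as follows. The
helper names and the 4-byte piece size are assumptions chosen to keep
the example small; real stripe sets use far larger pieces.

```python
def stripe_location(piece_number, member_count):
    """Return (member index, row on that member) for a 0-based piece."""
    return piece_number % member_count, piece_number // member_count


def split_into_pieces(data, piece_size):
    """Cut the data into consecutive pieces of at most piece_size bytes."""
    return [data[i:i + piece_size] for i in range(0, len(data), piece_size)]


members = ["SDSK2a", "SDSK2b", "SDSK2c"]
layout = {name: [] for name in members}

# Write 16 bytes to the virtual disk using a hypothetical 4-byte piece.
for n, piece in enumerate(split_into_pieces(b"ABCDEFGHIJKLMNOP", 4)):
    disk, _row = stripe_location(n, len(members))
    layout[members[disk]].append(piece)

for name in members:
    print(name, layout[name])
```

The fourth piece wraps around to SDSK2a, so losing any one member
leaves a gap in the data.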

Many systems combine both concepts and have mirrored striped disks,
to obtain both high read and write performance along with data
reliability.

RAID LEVELS:
============

A RAID array is a disk array in which part of the storage capacity is
used to store redundant information about the data stored on the
remainder of the storage capacity. This redundant information
enables the 'regeneration' of data in the event that one of the
array's member disks or the access path to that disk fails.

There are currently 5 RAID levels or types. The numbering labels are
in no way meant to indicate any hierarchical relationship. In other
words, RAID 5 is not 5 times better than RAID 1. Nor is it
necessarily better than RAID 1. Each RAID level has its own unique
operational and performance characteristics. Three of these levels
have proven to be commercially attractive. These are 1, 3, and 5.

Note: Although the term RAID 0 is often used to refer to disk
striping (because its striping concept is similar to that used in
RAID strategies), it is not technically a true RAID level, because it
does not provide for data redundancy.

A RAID 6 has also been reported in use at some research and
development and educational centers; however, it is not yet in
commercial use. (RAID 6 keeps data available if any two members of
the array fail. The current RAID levels allow for only one member to
fail.)

RAID Level 1 is disk mirroring. It is considered very reliable, but
more expensive than other RAID levels, because you need at least two
of every disk.

RAID Level 3 uses a parity disk to store redundant information about
the data stored on other disks in the array. This parity disk is a
separate dedicated disk. If this disk fails, then you no longer have
data redundancy.
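
To illustrate how a parity disk allows data to be regenerated, the
sketch below assumes the parity is a bytewise exclusive OR (XOR) of
the data disks. That is the usual choice, but it is an assumption
here and is not stated in this note; the disk contents are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three hypothetical data disks and the dedicated parity disk.
data_disks = [b"\x11\x22\x33", b"\x44\x55\x66", b"\x77\x88\x99"]
parity_disk = xor_blocks(data_disks)

# Suppose the second data disk fails. Its contents are regenerated
# from the surviving data disks plus the parity disk.
survivors = [data_disks[0], data_disks[2], parity_disk]
regenerated = xor_blocks(survivors)

print(regenerated == data_disks[1])
```

The same calculation fails if two disks are missing, which is why
such an array tolerates the loss of only one member.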

RAID Level 5 uses level 3's storage algorithm; however, it
distributes the parity information across all members of the array.
Level 5 offers data reliability approaching that of mirroring, with
the read I/O performance of striping. However, there is a substantial
performance penalty on write I/O operations.
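
A rough sketch of where the level 5 write penalty comes from is given
below. It assumes XOR parity, a read-modify-write update, and one
particular parity rotation; actual controllers may rotate parity
differently and may cache or batch writes. All names are hypothetical.

```python
def parity_member(stripe_number, member_count):
    """Index of the member holding parity for a stripe.

    Assumption: parity rotates backwards from the last member.
    """
    return (member_count - 1 - stripe_number) % member_count


def small_write(old_data, old_parity, new_data):
    """Recompute parity for a one-block update.

    Returns (new_parity, disk_ios). The old data and old parity must be
    read before the new data and new parity can be written.
    """
    new_parity = bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))
    disk_ios = 2 + 2  # two reads, two writes
    return new_parity, disk_ios


for stripe in range(4):
    print("stripe", stripe, "-> parity on member", parity_member(stripe, 3))

block_a, block_b = b"\x0f", b"\xf0"
parity = bytes(a ^ b for a, b in zip(block_a, block_b))
new_parity, ios = small_write(block_a, parity, b"\x3c")
print("disk I/Os for one logical write:", ios)
```

Under these assumptions a single logical write costs four disk I/Os,
two of them reads that must complete first, whereas a two-member
mirror needs only two writes that can proceed in parallel.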

Because of the substantial performance impact of write operations in
a RAID level 5 implementation, this level is not recommended for use
with PROGRESS. It is especially not recommended for disks where the
.bi file (or the .ai file, if after-imaging is enabled) is located.

RAID level 1 typically provides the best performance and data
reliability for use with PROGRESS.

REFERENCES TO WRITTEN DOCUMENTATION:
====================================

'A Case for Redundant Arrays of Inexpensive Disks (RAID)',
Patterson, Gibson, Katz, 1988, ACM SIGMOD Conference.

'The RAIDbook, A Source Book for Disk Array Technology',
RAB (RAID Advisory Board), 4th edition, 1994.

Progress Software Technical Support Note # 15056