RAID storage systems

What is a RAID?

A redundant array of independent (originally "inexpensive") disks, or RAID, uses multiple smaller disks that function as one large drive and, in most configurations, allows data to be recovered if a single drive fails. There are two broad approaches to maintaining large amounts of high-availability storage: use RAID technology, or use single large expensive disks (SLEDs). RAID won't solve all your problems with disks and file systems, but it does have strong advantages over maintaining SLEDs.

RAID technology has been developed to address three areas of disk storage:

large capacities
increased input/output performance
reliability through redundancy

A RAID device can be configured to act like a large, single logical drive. This is usually done with the aid of specialized RAID software or hardware, such as a RAID controller. The RAID mechanism acts as an intermediary between the multiple disk drives and the operating system. Because the RAID mechanism allows simultaneous reads and writes to all the drives in the array, and sometimes uses memory buffering for I/O requests, RAID technology is associated with an overall increase in I/O performance. The RAID mechanism can be configured to provide data redundancy through mirroring, which stores two copies of the original data, or through a scheme which uses "parity drives". A parity scheme does not store extra copies of the data; it stores parity information computed from the data striped across the drives, so that the contents of a failed drive can be rebuilt from the surviving drives. In this way the failure of one drive will not result in any loss of data.
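As a rough illustration of how parity allows a lost drive to be rebuilt, the sketch below assumes parity is computed as the byte-wise XOR of the data blocks, which is the usual approach for single-parity arrays. The function name xor_blocks and the sample blocks are hypothetical, chosen only for this example; a real controller works on much larger blocks and in hardware or driver code.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks, one per data drive.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block would be stored on the parity drive.
parity = xor_blocks(data)

# Simulate losing the second drive: XOR the survivors with the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

Because XOR is its own inverse, the same routine both generates the parity block and reconstructs any single missing block. Two simultaneous drive failures cannot be recovered this way.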

The RAID Levels

RAID 0: Data is striped across drives; no data redundancy is provided.
Multiple disks are used to improve performance, but there is no logic to protect or recover data. The performance gain is attained by using large blocks of data I/O and spreading the load across several disks.
RAID 1: Data redundancy is obtained by storing exact copies on mirrored pairs of drives.
Two or more copies of data are written to two or more different disks at the same time. Data may be read from any of the mirrored disks, based on device availability. Although reliability is high, so is the cost, since at least twice the amount of disk storage must be purchased.
RAID 2: Data is striped at the bit level; multiple error-correcting disks provide redundancy; not a commercially implemented RAID level.
RAID 3: Data is striped at the byte level, and one drive is set aside for parity information. This works well for large files where large block sizes and sequential I/O are used. Since only one drive is used for parity, the cost is lower than that of a RAID 1 implementation.
RAID 4: Data is striped in blocks, and one drive is set aside for parity information.
RAID 5: Data is striped in blocks, and parity information is rotated among all drives in the array.
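The difference between plain striping (RAID 0) and rotated parity (RAID 5) can be sketched as a mapping from a logical block number to a drive. The function names and the particular rotation order below are assumptions made for illustration; actual controllers use various layouts and stripe sizes.

```python
def raid0_location(block, n_drives):
    """Map a logical block to (drive, stripe) under plain striping."""
    return block % n_drives, block // n_drives

def raid5_location(block, n_drives):
    """Map a logical block to (drive, stripe, parity_drive).

    Each stripe holds n_drives - 1 data blocks plus one parity block.
    The parity drive moves back by one position on each successive
    stripe (an assumed rotation order, not a standard).
    """
    data_per_stripe = n_drives - 1
    stripe = block // data_per_stripe
    parity_drive = (n_drives - 1 - stripe) % n_drives
    slot = block % data_per_stripe
    # Skip over the drive that holds parity for this stripe.
    drive = slot if slot < parity_drive else slot + 1
    return drive, stripe, parity_drive

# Print the layout of the first few logical blocks on a 4-drive array.
for block in range(6):
    print(block, raid0_location(block, 4), raid5_location(block, 4))
```

Fixing parity_drive at n_drives - 1 for every stripe would give the RAID 4 arrangement, where a single dedicated drive holds all the parity. Rotating it, as RAID 5 does, spreads the parity blocks evenly over all the drives.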

Optional information about RAID is presented by Phase II Technical Sales, based on material supplied by Symbios Logic.

Why use RAID?

For some system administrators, the most compelling reason to implement RAID is the cost of system outages. Nearly half of system outages are caused by disk-related failures, and an outage may cost as much as $78,000/hr in lost revenue or production downtime.

Terms used: controller, striping.