RAID in Industrial Computer Systems

Introduction
This Chassis Plans White Paper discusses RAID. A Redundant Array of Independent Disks (RAID) is a collection of hard drives, one or more controller cards, and embedded software that together increase the reliability and redundancy of data storage on hard drives. RAID comes in multiple flavours offering improved performance, improved data reliability, or both. The RAID number, RAID-5 for example, does not indicate the number of drives involved. The most common RAID implementations are 0, 1, and 5. RAID can be implemented with or without the ability to hot swap a drive.

A variety of plug-in controllers that allow RAID implementation are available from virtually all drive controller manufacturers, such as Adaptec and DPT. These controllers all interface with SCSI drives and are available in ISA and PCI configurations. The PCI format provides the higher throughput. RAID support for IDE drives is not generally available.

Any SCSI drive can be used with a RAID controller, and drives from different manufacturers and of different sizes and throughputs can share the same bus. Check with the controller manufacturer for additional information. RAID controllers will also act as generic drive controllers, interfacing to CD and tape drives as well as external accessories such as scanners.

RAID drives can be permanently mounted in a chassis, mounted in removable 5-1/4” carriers, or mounted in external drive bays for easy accessibility and replacement in the case of a drive failure. Full RAID protection can be realized even with non-removable drives: the RAID system protects the data and provides time to take the system off-line to replace a failed drive. This can be a less expensive and potentially more reliable option than using expensive removable drive carriers. See Kingston Technology Data Express and JMR Wildcat for removable media.

RAID Level Definitions

RAID 0 Striping

The data is written across multiple drives to improve access performance. There is no data redundancy. For example, a 4 MB file would be written across 4 drives in 1 MB pieces. Note that the failure of any one drive will render the data inaccessible. The advantage is much higher throughput.
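
The 4-drive example above can be sketched in a few lines of Python. This is only an illustration of the round-robin idea, not how any particular controller works: the function names stripe and unstripe, the in-memory "drives", and the 1 MB stripe size are all assumptions made for the example.

```python
MEG = 1024 * 1024  # 1 MB stripe size, taken from the example in the text


def stripe(data: bytes, num_drives: int, stripe_size: int) -> list[bytes]:
    """Distribute data round-robin across drives in stripe_size pieces."""
    drives = [bytearray() for _ in range(num_drives)]
    for start in range(0, len(data), stripe_size):
        target = (start // stripe_size) % num_drives
        drives[target] += data[start:start + stripe_size]
    return [bytes(d) for d in drives]


def unstripe(drives: list[bytes], stripe_size: int) -> bytes:
    """Reassemble the data by reading one piece from each drive in turn."""
    out = bytearray()
    offset = 0
    while any(offset < len(d) for d in drives):
        for d in drives:
            out += d[offset:offset + stripe_size]
        offset += stripe_size
    return bytes(out)


# A 4 MB "file" (hypothetical test pattern) written across 4 drives.
file_data = bytes(range(256)) * (4 * MEG // 256)
drives = stripe(file_data, num_drives=4, stripe_size=MEG)

assert [len(d) for d in drives] == [MEG] * 4      # 1 MB lands on each drive
assert unstripe(drives, MEG) == file_data         # all 4 drives are needed to read it back
```

Because every drive holds a quarter of the file and nothing else, losing any one of the four lists leaves the file unrecoverable, which is the lack of redundancy noted above.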

RAID 1 Mirroring

Provides 100% redundancy by keeping an exact copy, or mirror, of the primary drive. Should one drive fail, the data will be completely accessible on the other drive. There is no performance improvement unless simultaneous reads are allowed. Note that twice as many drives must be purchased. One controller can provide mirroring across one bus, or two controllers can be used to provide controller redundancy as well as drive redundancy.

Adaptec provides an extensive online discussion of RAID, and of their controllers in particular, in their Array Guide.

RAID 10 or 0/1 Striping and Mirroring

A combination of RAID 0 and 1. The data is split across multiple drives for improved performance and each drive is mirrored for redundancy. Note that twice as many drives must be purchased.
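
A small Python sketch can make the redundancy and drive-count trade-off concrete. It assumes, purely for illustration, that drives are numbered so that drives 2k and 2k+1 form mirrored pair k; the function names and this numbering are hypothetical and not taken from any controller.

```python
def raid10_survives(failed_drives: set[int], num_pairs: int) -> bool:
    """Return True if the array still holds all data.

    Assumption: drives 2k and 2k+1 mirror each other (pair k), and the
    data is striped across the pairs. Data is lost only when both
    drives of the same mirrored pair have failed.
    """
    for pair in range(num_pairs):
        if {2 * pair, 2 * pair + 1} <= failed_drives:
            return False
    return True


def raid10_usable_capacity(num_pairs: int, drive_size_gb: int) -> int:
    """Half of the purchased drives hold mirror copies, so only one
    drive per pair counts toward usable capacity."""
    return num_pairs * drive_size_gb


# Four drives arranged as two mirrored pairs.
assert raid10_survives({0}, num_pairs=2)          # one drive lost: data intact
assert raid10_survives({0, 2}, num_pairs=2)       # one drive lost in each pair: data intact
assert not raid10_survives({0, 1}, num_pairs=2)   # both halves of pair 0 lost: data lost
assert raid10_usable_capacity(num_pairs=2, drive_size_gb=9) == 18  # 36 GB purchased
```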

RAID 2

A proprietary array patented by Thinking Machines, Inc. where the data is split on a bit level among several drives with additional drives providing parity information. Requires large numbers of drives. Not generally implemented.

RAID 3 Striping with Parity

Provides redundancy with improved performance. The data is shared across multiple drives with an additional drive providing parity information. The data striping improves performance but requires simultaneous reads as the array is accessed. The drive with the parity information can be used to reconstruct the data should one of the data disks fail. Usually used with 3 data drives and 1 parity drive. Small random writes are generally slow as the parity drive must be accessed for each write.
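
Parity in arrays of this kind is conventionally a byte-wise exclusive OR (XOR) of the data drives. The following Python sketch shows how 3 data drives and 1 parity drive allow a lost drive to be rebuilt. The helper name xor_blocks and the sample block contents are assumptions for illustration, and a real controller does this in hardware or firmware rather than in Python.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives, each holding one block (sample contents are made up).
data_drives = [b"IND-", b"RAID", b"DEMO"]

# The parity drive stores the XOR of the three data blocks.
parity = xor_blocks(data_drives)

# Simulate the failure of data drive 1: XOR the surviving data blocks
# with the parity block to recover what the failed drive held.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)

assert rebuilt == data_drives[1]
```

The same calculation shows why small writes are slow: changing any one data block means the parity block must be recomputed and rewritten as well.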

RAID 4 Striping with Dedicated Parity Disk

Similar to RAID 3 except that larger data blocks are striped, so an access does not require the participation of every drive. The parity drive is still accessed for each write.

RAID 5 Striping and Parity

The most common RAID implementation. Both the data and the parity information are striped across multiple drives, with each drive holding both data and parity information. Should any one drive fail, the remaining drives contain sufficient information to allow recovery. Provides complete redundancy with improved performance. The smallest RAID 5 implementation requires three drives, though more can be used for improved performance.
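
To show what "each drive holding both data and parity" looks like, here is a minimal Python sketch of a rotating-parity layout. The exact rotation order differs between controllers; the order used here (parity starting on the last drive and moving one drive to the left per stripe) and the function names are assumptions for the example.

```python
def raid5_layout(num_drives: int, num_stripes: int) -> list[list[str]]:
    """Return one row per stripe; 'P' marks the parity block, 'Dn' data block n."""
    if num_drives < 3:
        raise ValueError("RAID 5 requires at least three drives")
    rows = []
    block = 0
    for stripe in range(num_stripes):
        # Assumed rotation: parity moves one drive to the left on each stripe.
        parity_at = (num_drives - 1 - stripe) % num_drives
        row = []
        for drive in range(num_drives):
            if drive == parity_at:
                row.append("P")
            else:
                row.append(f"D{block}")
                block += 1
        rows.append(row)
    return rows


def raid5_usable_capacity(num_drives: int, drive_size_gb: int) -> int:
    """One drive's worth of space in total is consumed by parity."""
    if num_drives < 3:
        raise ValueError("RAID 5 requires at least three drives")
    return (num_drives - 1) * drive_size_gb


for row in raid5_layout(num_drives=3, num_stripes=3):
    print(" ".join(f"{cell:>3}" for cell in row))

assert raid5_usable_capacity(num_drives=3, drive_size_gb=9) == 18
```

With three drives the printed rows are D0 D1 P, then D2 P D3, then P D4 D5, so no single drive becomes the parity bottleneck described for RAID 3 and RAID 4.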

RAID 6

No real definition and can mean different things to different vendors.

RAID 7

Proprietary to Storage Technology, Inc. and is similar to RAID 4 with caching and a proprietary operating system to run the array.

General RAID Related Definitions

Hot swapping refers to the ability to remove a drive from an array while the system is powered up. This typically requires that the power connector pins on the drive tray be longer than the signal pins, so that on removal the signals are disconnected first and the power second, preventing glitches on the data bus. There are a variety of removable drive carriers, and it is important to ensure they support true hot swapping and not just removable media.

Warm swapping can be used to stop drive access while a drive is removed from the array. This is typically a software function or ‘button’ to suspend drive activity. A low cost removable drive carrier without hot swap can be used in this configuration.

Hot spare provides a back-up drive in the array that will automatically come on-line in the event of a failure of one of the other drives. Typically an array can tolerate only a single drive failure without data loss, so a hot spare drive shortens the window during which a second failure would cause total data loss.

SMART (Self-Monitoring, Analysis, and Reporting Technology) is a predictive failure analysis system where the drive performs self analysis and can communicate predicted failures to the controller. This allows early replacement of possibly faulty drives before actual drive failure.

Dynamic Sector Repair allows a RAID system to locate faulty sectors on drives, transparently repair the data and flag the sectors as bad to prevent future access.