
Redundant Array

of Inexpensive Disks:

Data Protection and Downtime Elimination

Christopher Lambert

December 2, 2000

P. Alexander

Management Information Systems

University of North Alabama


As information technology shifts into the future, one

aspect continues to remain the same: the importance and

value of data. Unlike hardware and software components,

data cannot be easily replaced; therefore, for a business

to be successful, measures to adequately protect data are

essential. It is the duty of IT management to institute

procedures for protecting the business’ data. Because

hardware and software failures are inevitable and unpredictable,

information technology departments have in the past made

do by building simple backup activities into daily

standard operating procedures. However, solely backing

data up onto tapes has become an insufficient means of data

protection since restoration requires downtime of

information systems, and downtime can be devastating to a

business depending upon the importance of the inaccessible

data. It has been stated that “94% of businesses that had

suffered a catastrophic non-recoverable failure in their

corporate IT storage systems went out of business within 2

years.” In an attempt to counter these unpredictable and

serious situations, RAID solutions should be implemented in

addition to periodic backups.

RAID is an acronym that stands for redundant array of

inexpensive disks. David A. Patterson, Garth Gibson, and

Randy H. Katz are credited with theorizing this technology

in 1987. RAID not only protects data but

also grants businesses higher levels of data integrity and

hardware/software fault tolerance. The benefits from using

RAID include increased system uptime and system

performance, two extremely important issues for IT

managers, which cannot be ignored in today’s business

place. However, the benefits from RAID extend far beyond

these two issues because of the impact it has on the entire

business and, more directly, the staff of IT departments.

Because of increased system uptime, IT employees can

devote their sometimes expensive time to other important

business issues, and, when supporting the business 24 hours

a day by being on call, data loss crises are minimized and

so is after-hours support. Furthermore, businesses in

continuous operation are not obliged to schedule downtime

to restore or back up data when a RAID technology is

implemented. Since the utilization of a RAID technology is

extremely advantageous to a business that places great importance

on its data, the question of whether or not to implement

the technology is easy to answer. That being obvious, the

next question to be answered is which of the many levels

and variations of RAID is suitable for the business.

There are many different applications of RAID, all

with different levels of protection and with a different

focus on the protection of the information system. The

most basic version, known as RAID 0, and more advanced

versions, such as RAID 5 and 7, may be implemented

depending upon the level and focus of protection required

by the business. Each level offers different performance,

fault tolerance, and cost. Managers must be aware of each

type of RAID, the advantages and disadvantages of each, and

be able to make an educated decision on which one best

suits their business. In addition to the different levels

of RAID, there are software and hardware versions of most

of them. The variations between the two are very different

and these options should also be taken into consideration.

Nonetheless, the first decision should be to determine

which level, or levels, would be the best practice for the

organization.

RAID 0 was created in the early advancements of the

RAID technology. This level is also known as striping.

Taking two or more disk drives, preferably five, and

striping them together to create one virtual disk will

accomplish this level of RAID. Data is then written to

what is known as the stripe set and is spanned across the

volume, where each drive operates in parallel with the others.

RAID 0 is commonly used in environments where files are

large and the data is sequential. The benefits of RAID 0

are fairly straightforward. Data access performance is

increased because data request queues are shortened for

each disk drive. Disk utilization is decreased because

there are more drives to help take on the load of data

access. This is achieved by writing data sequentially

across the drive set so the data can later be retrieved by

each drive simultaneously.

However, the increased performance of RAID 0 only applies

to applications using sequential access because it involves

no indexing of the data. Furthermore, striping the drives

together does nothing to protect the information stored on

the drives; therefore there is no data redundancy. In

spite of this, RAID 0 can be combined with other levels of

RAID to not only increase performance, but also employ data

redundancy and fault tolerance.
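
As a rough illustration of striping, the sketch below maps consecutive logical blocks onto a stripe set in round-robin order. The function name `stripe_location` and the five-drive set are assumptions made for this example; they do not describe any particular RAID product.

```python
def stripe_location(logical_block: int, num_disks: int) -> tuple[int, int]:
    """Return (disk index, block offset on that disk) for a logical block."""
    if num_disks < 2:
        raise ValueError("a stripe set needs at least two disks")
    return logical_block % num_disks, logical_block // num_disks


# Ten consecutive logical blocks spread across a hypothetical five-drive set.
layout = [stripe_location(block, 5) for block in range(10)]
for block, (disk, offset) in enumerate(layout):
    print(f"logical block {block} -> disk {disk}, offset {offset}")
```

Because neighboring blocks land on different drives, a large sequential request can be serviced by all members at once. Nothing in the mapping stores a second copy of any block, which is why a single drive failure loses the whole volume.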

RAID 1 encompasses the potential for data redundancy

and is commonly known as mirroring, which is the response

to the reliability issues of RAID 0. Instead of writing

the data across the set of drives, as in RAID 0, mirroring

duplicates the data across the set. For example, in the

simplest of cases, a system may have two hard disk

drives operating on the same controller. The same data

written to disk 0 would be simultaneously written to disk

1.

The RAID 1 scenario grants the user data protection, in

that when one drive fails, there is a replica, which can be

immediately brought online, depending upon the

sophistication of the environment, to eliminate any

downtime. Additionally, the failed disk drive can be

replaced during a more convenient time. Common uses for

RAID 1 include very sensitive data or data that is

mandatory for a system to operate, such as the boot drive,

and where data is not sequential. Because data is written

twice as often with RAID 1, it may seem that writes to the

drive set would take twice as long, but this is a myth. In

fact, writes to a mirrored set generally take only

15% to 20% longer than writes to a single member. Some

write performance to the mirrored array may be lost;

however, as in RAID 0, lowering disk utilization increases

performance. One other drawback to implementing RAID 1 is

the higher costs it demands, since disk drive requirements

double. Implementing RAID 1 and RAID 0 is a fairly simple

task, but they only lay the groundwork for the absolute

potential of RAID.
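
The mirroring behavior described above can be sketched with two in-memory lists standing in for disk 0 and disk 1. The class name `MirroredPair` and its methods are hypothetical and exist only for this illustration; a real controller would also handle resynchronization after a drive is replaced.

```python
class MirroredPair:
    """Two in-memory 'disks' that always receive the same writes."""

    def __init__(self, num_blocks: int) -> None:
        self.disks = [[None] * num_blocks, [None] * num_blocks]
        self.failed = [False, False]

    def write(self, block: int, data: bytes) -> None:
        # Every write goes to each surviving member.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def read(self, block: int) -> bytes:
        # Any surviving member can satisfy the read.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise IOError("both members of the mirror have failed")


pair = MirroredPair(num_blocks=4)
pair.write(0, b"payroll")
pair.failed[0] = True          # simulate losing disk 0
surviving_copy = pair.read(0)  # still served from disk 1
```

The sketch also makes the cost visible: two drives are purchased, but only one drive's worth of blocks is usable.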

Within the next couple of variations of the RAID

technology, specifically RAID 3 and RAID 5, a new concept

is introduced known as parity. RAID 3 uses the same theory

as RAID 0, but adds an extra drive to the array, which

maintains parity information about the data in the stripe

set. It divides the data across the stripe as in RAID 0,

and extra information is written to the additional disk in

corresponding blocks, which is the computed parity for the

data blocks residing on each of the members of the stripe unit.

Given this parity information and all but one of the blocks

of data, the contents of the destroyed or failed drive can be re-computed

or derived. This technology adds fault tolerance to the

stripe set, which is the “ability of a system to continue

to perform reads and writes in the event of a hard disk

failure.” Although this protection is not as great as

having a full mirror of the data, it does reduce the amount

of expensive downtime. RAID 3 is frequently used in

situations where large amounts of data are accessed

sequentially. On the other hand, this RAID level does not

work well with database management systems since they

usually exercise random access. The reason RAID 3 and RAID

0 operate more efficiently in circumstances where large

quantities of data are being read is because of the

physical arrangement of the drive set. Every write to a

drive in RAID 3 requires a write to the parity

drive; therefore, seek time is amortized when large amounts

of data are being requested and dominates when small

amounts are being requested.
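
The parity idea can be made concrete with a short sketch. It assumes parity is the byte-wise exclusive OR (XOR) of the data blocks, which is the usual choice although the discussion above does not name the function; the helper name `xor_blocks` and the sample blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data_blocks = [b"AAAA", b"BBBB", b"CCCC"]   # one block per data drive
parity = xor_blocks(data_blocks)            # stored on the parity drive

# Pretend the second drive failed: rebuild it from the rest plus parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

XOR-ing the surviving blocks with the parity block cancels everything except the missing block, so `rebuilt` equals the lost `b"BBBB"`. The same operation fails if two drives are lost, because two unknowns cannot be recovered from one parity block.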

RAID 5, on the other hand, solves the problem RAID 3

has with overutilizing the parity drive. In this level of

RAID, sometimes referred to as rotated parity, the parity

information is shared across the stripe set in consecutive,

yet different, locations. By doing this, the parity and

the data functions are shared by each member in the set.

RAID 5 performs just as well as RAID 3 when it comes to

sequential reads, and RAID 5 also outperforms the random

read performance of RAID 0. However, the write

performance suffers because RAID 5 adds some data

integrity, or the ability to ensure data is written

correctly, into data management. This data integrity is

accomplished by a series of short steps. First, the

members of the drive array are read and the parity

information is computed. In this step, each member disk is

read in parallel. After the new parity block is figured,

the data, parity, and block identities are written to a

log, which is completed in one input/output operation. The

data and parity information is then written to the member

disks in parallel after the log has been updated in case of

a catastrophe, such as a power outage. Finally, the data

associated with the write operation is removed from the

log. This process is often compared to the commit function

of a DBMS. If a disaster occurs during the write of the

data, information exists in the log to determine whether the data was

written correctly, hence the term data

integrity.
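
The rotated parity placement and the logged write steps described above can be sketched as follows. This is a simplified model under stated assumptions: parity is XOR, the rotation rule in `parity_disk` is only one of several layouts used in practice, the log is a plain Python list, and all function names are invented for the example.

```python
def parity_disk(stripe: int, num_disks: int) -> int:
    """Disk holding parity for a stripe; rotates one position per stripe."""
    return (num_disks - 1 - stripe) % num_disks


def xor_blocks(blocks):
    """XOR equal-length blocks together byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def logged_write(disks, log, stripe, target_disk, new_data):
    """Update one data block in a stripe using the log-then-write steps."""
    p = parity_disk(stripe, len(disks))
    if target_disk == p:
        raise ValueError("target is the parity member for this stripe")
    # Step 1: read the other data members and compute the new parity.
    others = [disks[d][stripe] for d in range(len(disks))
              if d not in (p, target_disk)]
    new_parity = xor_blocks(others + [new_data])
    # Step 2: record the intended change in the log before touching the array.
    log.append((stripe, target_disk, new_data, p, new_parity))
    # Step 3: write data and parity to the member disks.
    disks[target_disk][stripe] = new_data
    disks[p][stripe] = new_parity
    # Step 4: the write is complete, so remove its entry from the log.
    log.pop()


# Four hypothetical disks, three stripes each, all blocks initially zero.
disks = [[bytes(4) for _ in range(3)] for _ in range(4)]
log = []
logged_write(disks, log, stripe=0, target_disk=0, new_data=b"DATA")
```

If power were lost between steps 2 and 4, the entry left in the log would identify the stripe whose data and parity might disagree, which is the role the commit comparison above assigns to it.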

This protection in RAID 5, however, does not come

without a price. After the read for the parity

information, the parity computation, the two writes for the

two-phase commit log, the write for the parity, and one or

two writes for the data members, the write performance is

greatly diminished, typically by about 60%. The benefit for

trading off this write performance is data integrity, read

performance, and data protection. If one of the drives in

the RAID 5 configuration fails, the missing data can be

reconstructed on the fly. This is known as degraded mode.

For every read and every write in degraded mode, each disk

drive in the array must be accessed to compute the missing

data. Depending upon the size of the array, this can

result in an astounding amount of overhead. Therefore, it

is a common practice to limit the number of drives in a

RAID 5 arrangement to six (6) drives to safeguard

performance during degraded mode. RAID 5 is one of the

more confusing and complex levels of RAID, yet it is still

the most common and works well in most environments.

Still, other options exist if the previous, most common,

levels of RAID do not meet the needs of the organization.

RAID 7 is a fairly unconventional level of RAID that

has been trademarked by Storage Computer Corporation. This

level of RAID, although proprietary, is used significantly

in the market and is worthy of being explained alongside

the others. RAID 7 takes advantage of the framework of

RAID 3 and RAID 4, which is not much different than RAID 3,

but greatly improves on their shortcomings. The greatest

difference with RAID 7 is its heavy use of cache, or a

technique to buffer data in an attempt to supply a provisional

storage area that will allow a faster disk drive to operate

without being hindered by a slower device. Through the use

of large amounts of cache, RAID 7 allows many functions to

be performed simultaneously, greatly improving performance

while continuing to support fault tolerance. As in RAID 3

and 4, RAID 7 has a dedicated parity drive, yet does not

suffer from the same dilemma as the other levels using a

dedicated drive for parity because of the

asynchronous I/O transfers it supports. It has been

reported that RAID 7 performance is 1.5 to 6 times better

than the other levels of RAID, and its write performance is

25% to 90% better than using a single member. The downsides

to RAID 7 include an extremely high cost per megabyte of

storage, a lack of user serviceability, and no use of

the two-phase commit of RAID 5. Because of the high

cache usage of RAID 7, it is recommended to implement a

UPS, or uninterruptible power supply, as well. This is one

of the most expensive implementations of the RAID technology

and comparable results can be attained through the

implementation of a combined RAID technology, known as RAID

0+1, which will be discussed later.

Moving beyond the single-technology RAID levels, RAID 10

combines high performance with high reliability. RAID 10

is a combination of RAID 1 and RAID 0, but is not the same

as RAID 0+1. In this scenario, two RAID 1 arrays are

striped. This level of RAID offers the same protection as

RAID 1; however, striping the array boosts performance.

Furthermore, under some circumstances, RAID 10 is known to

maintain uptime in the event of multiple drive failures.

Nonetheless, RAID 10 carries with it the same expensive

quality as RAID 1 because the number of disks doubles.

Because it stripes two RAID 1 arrays, four (4) drives

are the minimum needed to implement this level. Ultimately,

the actual data space available is only 50% of the

total drive space. To avoid the high costs of the upper,

more complex levels of RAID, the IT department may opt to

simply combine the lower levels of RAID, as in RAID 0+1.

RAID 0+1 is very similar to RAID 10, but is the direct

opposite. Instead of striping the mirrored arrays as in

RAID 10, RAID 0+1 mirrors two striped sets. In a four-drive

configuration, there are two RAID 0 arrays. Each pair of

drives is striped, and then the two stripe sets are mirrored.

In contrast to RAID 10, this level has the same fault

tolerance as RAID 5, but has higher I/O rates as a result

of the multiple stripe sets. However, if one drive fails

in either set, this configuration will, in essence, break

the mirror capability and become a RAID 0 array, which only

supports striping. This RAID solution is excellent for

organizations that need higher performance than RAID 5, but

do not need the extended reliability.
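
The difference in reliability between RAID 10 and RAID 0+1 can be checked by enumerating every possible two-drive failure in a four-drive array. The sketch below is a simplified model: the drive numbering, the grouping of drives 0 and 1 versus 2 and 3, and the function names are assumptions made for illustration.

```python
from itertools import combinations

DRIVES = range(4)
GROUPS = [(0, 1), (2, 3)]  # the two pairs in a four-drive array


def raid10_survives(failed: set[int]) -> bool:
    """Each group is a mirror; the stripe needs one live drive per mirror."""
    return all(any(d not in failed for d in group) for group in GROUPS)


def raid01_survives(failed: set[int]) -> bool:
    """Each group is a stripe set; the mirror needs one fully intact set."""
    return any(all(d not in failed for d in group) for group in GROUPS)


two_drive_failures = [set(pair) for pair in combinations(DRIVES, 2)]
raid10_ok = sum(raid10_survives(f) for f in two_drive_failures)
raid01_ok = sum(raid01_survives(f) for f in two_drive_failures)
print(f"RAID 10 survives {raid10_ok} of {len(two_drive_failures)} double failures")
print(f"RAID 0+1 survives {raid01_ok} of {len(two_drive_failures)} double failures")
```

Under this model RAID 10 stays up in four of the six double-failure cases, while RAID 0+1 stays up in only two, because a single failed drive takes its entire stripe set out of service.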

It is fairly obvious many variations and

configurations for RAID exist. In fact, there are other

levels not mentioned here, such as RAID 53, a combination

of, surprisingly, 0 and 3, and RAID 6, 5+1, 1+5, 5+0, and

0+5, all of which have specific advantages and

disadvantages, however rare they may be. Making the

determination of which variation to implement is only the

first decision for the management in putting RAID into

action. Following are some of the other items of interest

managers should observe when implementing RAID, as they are

important components and technologies.

Merely implementing a RAID technology does not

eliminate the fact drives will fail even though it does

reduce the pains of data recovery. To further address the

issue of device MTBF, or mean time between failures, hot-

swapping has become a favorable practice. This technology,

also known as hot spares, is quite impressive. In addition

to the drives in the drive array, additional spare disks

are attached to the system, albeit inactive, waiting for

one of the active drives to fail. The spare drive, in the

event of a failure and given the environment supports some

type of autorecovery, will immediately take the place of

the failed disk. In the event the environment does not

support autorecovery, some employee intervention would be

required, but the same result of instant repair is

achieved. These spares, in some cases known as a pool, can

either be dedicated or non-dedicated, meaning their role

has either been pre-defined or their use is left up to the

system in the event of a failure. This technology can,

fundamentally, be applied to other components in the

storage system and, in this sense, is known as duplexing.
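
The distinction between dedicated and non-dedicated spares can be sketched as a small selection routine. Everything here is hypothetical: the `Array` class, the `replace_failed` function, and the drive labels are invented for the example, and the rebuild of data onto the spare is left out.

```python
class Array:
    """A named drive array holding a list of drive labels."""

    def __init__(self, name, drives):
        self.name = name
        self.drives = list(drives)


def replace_failed(array, failed_drive, dedicated, shared_pool):
    """Swap a failed drive for a spare, preferring the array's dedicated spares."""
    spares = dedicated.get(array.name, [])
    source = spares if spares else shared_pool
    if not source:
        return None  # no spare available; operator intervention is needed
    spare = source.pop(0)
    array.drives[array.drives.index(failed_drive)] = spare
    return spare


db_array = Array("db", ["d0", "d1", "d2"])
dedicated = {"db": ["s0"]}   # spare reserved for this array only
shared_pool = ["s1"]         # spare the system may assign to any array

first = replace_failed(db_array, "d1", dedicated, shared_pool)
second = replace_failed(db_array, "d2", dedicated, shared_pool)
third = replace_failed(db_array, "d0", dedicated, shared_pool)
```

The first failure consumes the dedicated spare, the second falls back to the shared pool, and the third finds no spare left, which corresponds to the case where an employee must step in.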

Duplexing involves adding redundant pieces of

equipment outside the RAID array. More specifically, it is

common practice and is recommended to duplex array

controllers, or adapters. These components are the array’s

interface to the I/O structure of the system. If a system

is only configured with one controller and it expires, the

efforts to maintain high data availability are in vain.

This is not acceptable in the IT industry. Duplexing the

array controllers counteracts this possible disaster, and,

in addition, increases the system performance by expanding

the bus for data input/output. The term duplexing

generally refers to array adapters, yet can be applied to

any component in the system, including the system itself!

Other common components to duplex include fans, power

supplies, and processors.

Adding these extra, but highly important, devices can,

in some cases, as much as double the total cost of

implementing a RAID solution. However, without adapter

duplexing and making hot spare drives available, the RAID

solution selected is still vulnerable to failure. If they cannot

be afforded, though, a basic RAID implementation is better

than no performance boost or data protection at all. That

being said, there is one more decision that must be made before

installing a RAID solution, which relates to overall cost,

future total cost of ownership, and data availability, and

that is whether to install a hardware or software based

solution.

Hardware-based RAID solutions generally offer greater

data availability and serviceability over software based

solutions. When protection of data and data availability

are imperative, a hardware-based solution is the only viable

choice. These types of solutions usually have the ability

to detect more bit errors than software solutions, thus

increasing the system's data integrity, which is important

to almost all organizations. Furthermore, hardware-based

solutions typically offer more robust fault tolerance

measures. It is not uncommon for a hardware-based solution

to come standard with hot spare disk pools and duplexed

controllers. Another major advantage hardware solutions

have over software is their ability to take advantage of

bootable arrays. Not necessarily the most important part

of a storage system, it is an essential component, and

having the ability to stripe or mirror the boot drive is a

great advantage to uptime. Hardware solutions, more often

than not, are capable of automatically detecting a disk

failure. This automatic detection advantage can avert

hours of downtime, depending upon the time of the failure.

Also associated with hardware solutions are reductions in

CPU interrupts and in main PCI bus traffic, which in turn

grant better overall system performance. Hardware-based

solutions, unexpectedly, have a lower total cost of

ownership than software based solutions in spite of the

higher costs that accompany hardware RAID solutions.

Finally, if the management team selects any RAID

configuration other than RAID 0 or RAID 1, a hardware-based

solution is the only choice, because of the extreme demand

levels 3, 5, 7, 10, and 0+1 require of a system.

If one of the lower levels of RAID meets the needs

of the organization and the IT department, a software-based

solution may be appropriate. The major benefit of software

RAID solutions is their cost. In some cases, software-

based solutions are free, as with Windows NT Server and

some network operating systems. The low front-end costs

reflect the low system performance and limited

functionalities of software-based solutions. Error

protection and bit error detection are performed by the

system's CPU, which takes processing power away from

business applications. Furthermore, software-based

solutions are not capable of correcting data errors.

Software RAID solutions can only detect errors. In order

to detect the bit errors, the software-based solution

relies upon the functionality of the adapter itself, thus

decreasing I/O performance. Because of the high level of

operator involvement, the total cost of ownership is

actually higher than that of hardware-based solutions, in

the long run.

One of the main considerations management should keep

in mind is the type of applications the RAID solution will

be supporting, I/O bound or CPU bound applications, for

example. With hardware-based solutions, the number of

CPU interrupts is much lower than with software-based

solutions, which frees the CPU to perform computationally

intensive functions. In addition, by minimizing the I/O of

the PCI bus, other activities, such as network traffic, can

be processed much more efficiently. In a CPU bound

environment, a hardware-based solution is much more

appropriate than a software-based solution, because RAID 5

parity checks and secondary writes in RAID 1 are offloaded

onto a RAID coprocessor. Moreover, software based

solutions, such as those incorporated into the Windows NT

and Novell Netware operating systems, do not support the

advantage of setting the priority for drive spares, which

speeds up array reconstruction. Given that

hardware-based solutions are more advantageous, although

more expensive, IT departments would be wise to implement

hardware RAID solutions over software solutions when

possible. As expected, there are many solutions available

to IT departments from many different vendors.

The two solutions to be compared and contrasted here

are Compaq’s RAID Array 4100 and Dell’s Powervault 650F.

Both of these systems are targeted toward medium sized

businesses with at least 400 employees. These two systems

interface into an existing network via the fiber channel

interface. This interface is fairly new technology designed

to overcome some difficulties with existing interfaces into

network storage devices. It is a serial interface built on the

SCSI framework by individuals who knew the shortcomings of

storage interfaces. It operates at 100MB per second, or 200MB

per second in full duplex, which is 2.5 times faster than the

existing UltraSCSI interface (40MB per second), and it can be

connected by either twisted pair cable or fiber optics. The

two systems,

furthermore, come with management software to ease the

process of installation and maintenance. The interface

required by both systems is PCI, yet an EISA interface is

available for the Compaq solution. Given these

similarities, their differences should be carefully

considered.

Compaq Computer Corporation’s RAID Array 4100 is the

newest model in the RAID Array family, replacing the RAID

Array 4000.

This system has 64MB total cache memory, which is comprised

of 16MB ECC protected read and 48MB battery assisted user

selectable read/write memory. This system supports up to

twelve (12) one inch Ultra2 universal drives with support

for both Wide-Ultra SCSI3 and Fast-Wide SCSI2 drive

interfaces. It supports RAID levels 0, 1, 4, and 5. In

regard to high availability features, the RAID Array 4100

supports hot-pluggable, redundant power supplies, redundant

fans, and hot-pluggable hard drives. This feature grants

the system operator the ability to swap failed drives at

the time of failure without having to schedule downtime.

Furthermore, the redundant, hot-pluggable power supplies

grant further protection against blackouts and brownouts,

and, if one power supply fails, the other is more than

capable of keeping the system up until such time a new one

can be installed. There is a standard Compaq one-year

warranty accompanying this product; however, extended

support may be purchased. This solution is redundantly

supported by the following operating systems: Microsoft

Windows NT® 4.0, Windows NT® Enterprise Edition, and

Microsoft Cluster Server. In addition, Novell NetWare

versions 3.12 to 5.1, Novell’s IntraNetWare, SCO

OpenServer, UnixWare 2.1, UnixWare 7, Banyan Vines 6.x and

7.x, OS/2 SMP 2.11, and the OS/2 Warp Server Family non-

redundantly support it.

By contrast, Dell’s Powervault 650F also has many

impressive features.

The Powervault has a remarkable total cache memory capacity

of 512MB to support the 400MHz processors onboard the RAID

controllers. It has the capacity for 10 fiber channel disk

drives, not to mention a pre-configured expansion unit

available for an additional 10 drives. Without the

expansion unit, the Powervault 650F has a maximum capacity

of 4TB, or 4000 gigabytes. Drive form factors supported by

this system include not only the one-inch, but, also, the

1.6 inch variations. Data protection presented with this

system includes RAID levels 0, 1, 5, and 10. As with

Compaq’s RAID Array 4100, the Powervault 650F has hot

swappable drives, redundant, hot-swappable power supplies,

and redundant cooling fans. This Dell solution comes with

a limited three-year warranty and a one-year warranty for

parts replacement.

Overall, these two systems would make good solutions

for IT departments supporting sites with less than 400 end

users. The overall reliability of the two exceeds that of

most others reviewed; however, cost information was

unavailable for the Compaq RA4100 without an RFQ. The Dell

Powervault 650F does offer great scalability as far as

potential RAID levels and drive space expandability.

Likewise, the Compaq RA4100 is supported by many industry

standard operating systems. Truly, a sales representative

should be contacted and questioned before making a purchase

of this magnitude.

Summing up, RAID technology has proven its importance

to the information technology industry time and time again

since its theorization in 1987. The management of the

information systems staff should take the advantages of

each level of RAID to heart and consider the benefit of

adding the corresponding components to further increase the

uptime and protection of the data. Failing hardware will

be an issue to be dealt with for many years into the

future; therefore, RAID will continue to be a popular

option to offset the peril of drive failure.

Bibliography

Adaptec, Inc. “ABC’s of RAID.”

http://www.adaptec.com/products/guide/abcraid.html

November 20, 2000.

Adaptec, Inc. “Hardware v. Software RAID.”

http://www.adaptec.com/technology/whitepapers/raid_hw_sw01.html

November 20, 2000.

Angel, Jonathan. Network Magazine. “Lesson 144: RAID.”

San Francisco. July 2000. Vol 15 Issue 7. Pg. 34.

Compaq Computer Corp Web Site. “Compaq RA4100.”

www.compaq.com November 30, 2000.

Dell, Inc. Web Site. “Powervault 650F”

http://support.dell.com/docs/systems/sjade/650F/5867c0.pdf

November 30, 2000.

Dell, Inc. Web Site. “RAID Technology.”

http://www.dell.com/us/en/biz/topics/vectors_1999-raid.htm

November 30, 2000.

Grigonis, Richard. Computer Telephony. “American ProImage

‘RAIDs’ the Industry.” San Francisco. October 1999.

Vol 7 Issue 10. Pg. 141.

Patterson, David A., Gibson, Garth, and Katz, Randy. “A Case for

Redundant Arrays of Inexpensive Disks.” Berkeley. 1987.

Planet IT Web Site. “Right RAID For You.”

http://www.planetit.com/techcenters/docs/Storage/expert/PIT19990113S0013

November 28, 2000.

RAID7 Web Site. “RAID 7 Architecture.”

http://www.raid7.com/wp_raid7afa.html

November 19, 2000.

Rapaport, Lowell. Imaging & Document Solutions. “RAID

today…and tomorrow.” San Francisco. January 1999. Vol

8 Issue 1. Pg. 55.

Soran, Phil. Inform. “RAID is the answer for fast, secure

storage.” Silver Spring. April 1999. Vol 13 Issue 4.

Pg. 8-9.

Wong, Brian. “RAID: What does it mean to me?”

http://www.sunworld.com/sunworldonline/swol-09-1995/swol-09-raid5_p.html

November 29, 2000.

Yager, Tom. Unix Review’s Performance Computing. “RAID!”

San Francisco. April 1999. Vol 17 Issue 4. Pg. 21-24.
