
RAID: Why and When

RAID stands for Redundant Array of Independent Disks (some are taught "Inexpensive" to
indicate that they are "normal" disks; historically, there were internally redundant disks, which
were very expensive; since those are no longer available, the acronym has adapted).
At the most general level, a RAID is a group of disks that act on the same reads and writes. SCSI
IO is performed on a volume ("LUN"), and the IO is distributed to the underlying disks in a way
that increases performance, redundancy, or both. The performance increase is a function of
striping: data is spread across multiple disks so that reads and writes can use all the disks' IO
queues simultaneously. Redundancy is a function of mirroring: entire disks can be kept as
copies, or individual stripes can be written multiple times. Alternatively, in some types of RAID,
instead of copying data bit for bit, redundancy is gained by creating special stripes that contain
parity information, which can be used to recreate any lost data in the event of a hardware
failure.
There are several configurations that provide different levels of these benefits, which are covered
here, and each one has a bias toward performance or redundancy.
An important aspect in evaluating which RAID level will work for you is its advantages and
hardware requirements (e.g. the number of drives).
Another important aspect of most of these types of RAID (0, 1, 5) is that they do not ensure the
integrity of your data, because they are abstracted away from the actual data being stored. RAID
does not protect against corrupted files: if a file is corrupted by any means, the corruption will be
mirrored or parity-encoded and committed to disk regardless. However, RAID-Z does claim to
provide file-level integrity of your data.

Direct attached RAID: Software and Hardware
There are two layers at which RAID can be implemented on direct attached storage: hardware
and software. In true hardware RAID solutions, there is a dedicated hardware controller with a
processor dedicated to RAID calculations and processing. It also typically has a battery-backed
cache module so that in-flight data can still be written to disk after a power failure, which helps
to eliminate inconsistencies when systems are not shut down cleanly. Generally speaking, good
hardware controllers are better performers than their software counterparts, but they also have a
substantial cost and increase complexity.
Software RAID typically does not require a controller, since it doesn't use a dedicated RAID
processor or a separate cache; these operations are handled directly by the CPU. In modern
systems, the calculations consume minimal resources, though some marginal latency is incurred.
RAID is handled either by the OS directly or by a faux controller in the case of FakeRAID.
Generally speaking, if someone is going to choose software RAID, they should avoid FakeRAID
and use the OS-native package for their system, such as Dynamic Disks in Windows,
mdadm/LVM in Linux, or ZFS in Solaris, FreeBSD, and other related distributions. FakeRAID
uses a combination of hardware and software, which results in the initial appearance of hardware
RAID but the actual performance of software RAID. Additionally, it is commonly extremely difficult
to move the array to another adapter (should the original fail).

Centralized Storage
The other place RAID is common is on centralized storage devices, usually called a SAN
(Storage Area Network) or a NAS (Network Attached Storage). These devices manage their own
storage and allow attached servers to access it in various fashions. Since multiple workloads are
contained on the same few disks, having a high level of redundancy is generally desirable.
The main difference between a NAS and a SAN is block-level vs. file-system-level exports. A SAN
exports a whole "block device" such as a partition or logical volume (including those built on top
of a RAID array); examples of SAN protocols include Fibre Channel and iSCSI. A NAS exports a
"file system" such as a file or folder; examples of NAS protocols include CIFS/SMB (Windows
file sharing) and NFS.

RAID 0
Good when: Speed at all costs!
Bad when: You care about your data.
RAID 0 (aka Striping) is sometimes referred to as "the amount of data you will have left when a
drive fails". It really runs against the grain of "RAID", where the "R" stands for "Redundant".
RAID 0 takes your block of data, splits it up into as many pieces as you have disks (2 disks, 2
pieces; 3 disks, 3 pieces) and then writes each piece of the data to a separate disk.
This means that a single disk failure destroys the entire array (because you have Part 1 and Part
2, but no Part 3), but it provides very fast disk access.
It is not often used in production environments, but it could be used in a situation where you have
strictly temporary data that can be lost without repercussions. It is used somewhat commonly for
caching devices (such as an L2ARC device).
The total usable disk space is the sum of all the disks in the array added together (e.g. 3x 1TB
disks = 3TB of space).
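As a rough illustration, here is a minimal Python sketch of round-robin striping; the block size and disk count are toy values, not tied to any real controller:

```python
# Minimal striping sketch: logical blocks are dealt round-robin across disks.
# BLOCK_SIZE and the disk count are toy values for illustration only.
BLOCK_SIZE = 4  # bytes

def stripe(data: bytes, num_disks: int) -> list:
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    disks = [[] for _ in range(num_disks)]
    for i, block in enumerate(blocks):
        disks[i % num_disks].append(block)  # block i lands on disk i mod N
    return disks

print(stripe(b"ABCDEFGHIJKL", 3))
# [[b'ABCD'], [b'EFGH'], [b'IJKL']] -- lose any one disk and the file is gone
```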


RAID 1
Good when: You have a limited number of disks but need redundancy
Bad when: You need a lot of storage space
RAID 1 (aka Mirroring) takes your data and duplicates it identically on two or more disks
(although typically only 2 disks). If more than two disks are used, the same information is stored
on each disk (they're all identical). It is the only way to ensure data redundancy when you have
fewer than three disks.
RAID 1 sometimes improves read performance. Some implementations of RAID 1 will read from
both disks to double the read speed. Some will only read from one of the disks, which does not
provide any additional speed advantages. Others will read the same data from both disks,
ensuring the array's integrity on every read, but this will result in the same read speed as a single
disk.
It is typically used in small servers that have very little disk expansion, such as 1RU servers that
may only have space for two disks or in workstations that require redundancy. Because of its
high overhead of "lost" space, it can be cost prohibitive with small-capacity, high-speed (and
high-cost) drives, as you need to spend twice as much money to get the same level of usable
storage.
The total usable disk space is the size of the smallest disk in the array (e.g. 2x 1TB disks = 1TB
of space).
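As a toy illustration, here is a minimal mirroring sketch; the class and its structure are invented for the example, not taken from any real implementation:

```python
# Minimal mirroring sketch: every write lands on all members, and reads
# alternate between members (one possible read-balancing policy).
class Mirror:
    def __init__(self, num_disks: int):
        self.disks = [{} for _ in range(num_disks)]
        self._next = 0  # round-robin read pointer

    def write(self, lba: int, block: bytes) -> None:
        for disk in self.disks:  # identical copy on every member
            disk[lba] = block

    def read(self, lba: int) -> bytes:
        block = self.disks[self._next][lba]  # spread reads across members
        self._next = (self._next + 1) % len(self.disks)
        return block

m = Mirror(2)
m.write(0, b"payload")
assert m.read(0) == m.read(0) == b"payload"  # either member can serve the read
```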


RAID 1E
The 1E RAID level is similar to RAID 1 in that data is always written to (at least) two disks. But
unlike RAID 1, it allows for an odd number of disks by simply interleaving the data blocks among
several disks.
Performance characteristics are similar to RAID 1; fault tolerance is similar to RAID 10. The
scheme can be extended to odd numbers of disks greater than three (possibly called RAID 10E,
though rarely).
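A minimal sketch of one common 1E layout, assuming the "data stripe, then a mirrored copy shifted by one disk" scheme described above (block numbering is illustrative):

```python
# One common RAID 1E layout: write a data stripe across all disks, then a
# mirrored copy of the same stripe shifted by one disk.
def raid1e_layout(num_blocks: int, num_disks: int) -> list:
    disks = [[] for _ in range(num_disks)]
    for start in range(0, num_blocks, num_disks):
        row = [start + i for i in range(num_disks)]
        for i, b in enumerate(row):
            disks[i].append(b)                    # data pass
        for i, b in enumerate(row):
            disks[(i + 1) % num_disks].append(b)  # mirror pass, shifted one disk
    return disks

print(raid1e_layout(3, 3))
# [[0, 2], [1, 0], [2, 1]] -- every block lives on two different disks
```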


RAID 10
Good when: You want speed and redundancy
Bad when: You can't afford to lose half your disk space
RAID 10 is a combination of RAID 1 and RAID 0. The order of the 1 and 0 is very important. Say
you have 8 disks: it will create 4 RAID 1 arrays, and then apply a RAID 0 across the 4 RAID 1
arrays. It requires at least 4 disks, and additional disks have to be added in pairs.
This means that one disk from each pair can fail. So if you have sets A, B, C and D with disks
A1, A2, B1, B2, C1, C2, D1, D2, you can lose one disk from each set (A, B, C or D) and still have
a functioning array.
However, if you lose two disks from the same set, then the array is totally lost. You can lose up
to (but are not guaranteed to survive) 50% of the disks.
You are guaranteed high speed and high availability in RAID 10.
RAID 10 is a very common RAID level, especially with high capacity drives, where a single disk
failure makes a second disk failure more likely before the RAID array is rebuilt. During recovery,
the performance degradation is much lower than with its RAID 5 counterpart, as it only has to
read from one drive to reconstruct the data.
The available disk space is 50% of the sum of the total space (e.g. 8x 1TB drives = 4TB of
usable space). If you use different sizes, only the capacity of the smallest disk in each pair will be used.


RAID 01
Good when: never
Bad when: always
It is the reverse of RAID 10: it creates two RAID 0 arrays, and then puts a RAID 1 over the top.
The first failed disk breaks its entire RAID 0 set, leaving the whole array dependent on the other
set; further disks can be lost only from the already-broken set (A1, A2, A3, A4 or B1, B2, B3, B4).
To be absolutely clear:
If you have a RAID 10 array with 8 disks and one dies (we'll call it A1), then you'll have 6 disks
that are still redundant and 1 without redundancy (its mirror partner, A2). If another disk dies,
there's an 86% chance (6 in 7) that your array is still working.
If you have a RAID 01 array with 8 disks and one dies (we'll call it A1), then you'll have 3 disks
whose loss no longer matters (A2, A3, A4, since their stripe is already broken) and 4 disks
without redundancy (B1, B2, B3, B4). If another disk dies, there's only a 43% chance (3 in 7)
that your array is still working.
It provides no additional speed over RAID 10, but substantially less redundancy, and should be
avoided at all costs.
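The odds above fall out of simple counting. A quick sketch, assuming 8 equal disks and a uniformly random second failure:

```python
# Second-failure survival odds with 8 disks, disk A1 already dead.
remaining = 7

# RAID 10: only A1's mirror partner (A2) is fatal, so 6 of 7 outcomes survive.
print(f"RAID 10 survival: {6 / remaining:.0%}")  # 86%

# RAID 01: stripe A is already broken, so losing any of B1-B4 is fatal;
# only A2, A3 or A4 (3 of 7) are harmless.
print(f"RAID 01 survival: {3 / remaining:.0%}")  # 43%
```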

RAID 5
Good when: You want a balance of redundancy and disk space or have a mostly
random read workload.
Bad when: You have a high random write workload or large drives.
RAID 5 has been the most commonly used RAID level for decades. It provides the combined
performance of all the drives in the array (except for small random writes, which incur a slight
overhead). It uses a simple XOR operation to calculate parity; upon a single drive failure, the
information can be reconstructed from the remaining drives using the XOR operation on the
known data.
Unfortunately, in the event of a drive failure, the rebuilding process is very IO intensive. The
larger the drives in the RAID, the longer the rebuild will take, and the higher the chance for a
second drive failure. Since large slow drives both have a lot more data to rebuild and a lot less
performance to do it with, it is not usually recommended to use RAID5 with anything 7200 RPM
or lower.
It is also imperative that RAID 5 be put behind a reliable (battery backed) write cache. This
avoids the overhead for small writes, as well as flaky behaviour that can occur upon a failure in
the middle of a write.
RAID 5 is the most cost-effective solution for adding redundant storage to an array, as it
sacrifices the capacity of only 1 disk (e.g. 12x 146GB disks = 1606GB of usable space). It
requires a minimum of 3 disks.
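A minimal sketch of the XOR parity math, using toy two-byte blocks:

```python
# XOR parity sketch: parity is the XOR of all data blocks, and any single
# lost block is rebuilt by XORing the survivors.
def xor_blocks(*blocks: bytes) -> bytes:
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

d1, d2, d3 = b"\x10\x20", b"\x0f\x0f", b"\xaa\x55"
parity = xor_blocks(d1, d2, d3)

rebuilt = xor_blocks(d1, d3, parity)  # the disk holding d2 has failed
assert rebuilt == d2  # parity plus survivors recreates the lost block
```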


RAID 6
Good when: You want to use RAID 5, but your disks are too large or slow
Bad when: You have a high random write workload.
RAID 6 is similar to RAID 5, but it uses two disks' worth of parity instead of just one (the first is
XOR, the second is an LFSR), so you can lose two disks from the array with no data loss. The
write penalty is higher than with RAID 5, and you have one less disk of space.
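As a sketch of how the two syndromes can be generated, assuming a Reed-Solomon-style Q over GF(2^8) (the approach used by Linux md's raid6); two-disk recovery, which solves P and Q together, is omitted here:

```python
# P/Q generation sketch: P is plain XOR; Q weights disk i by g**i in GF(2^8).
EXP, LOG = [0] * 512, [0] * 256
x = 1
for i in range(255):
    EXP[i], LOG[x] = x, i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d  # reduce modulo the field's primitive polynomial
for i in range(255, 512):
    EXP[i] = EXP[i - 255]  # let exponents wrap without a modulo in gf_mul

def gf_mul(a: int, b: int) -> int:
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]

def pq(data: list) -> tuple:
    p = q = 0
    for i, d in enumerate(data):
        p ^= d                  # ordinary RAID 5 parity
        q ^= gf_mul(EXP[i], d)  # second, independent syndrome
    return p, q

print([hex(v) for v in pq([0x10, 0x0f, 0xaa])])  # ['0xb5', '0x9c']
```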


RAID 50
Good when: You have a lot of disks that need to be in a single array and RAID 10
isn't an option because of capacity.
Bad when: You have so many disks that many simultaneous failures are possible
before rebuilds complete. Or when you don't have many disks.
RAID 50 is a nested level, much like RAID 10. It combines two or more RAID 5 arrays and
stripes data across them in a RAID 0. This offers both performance and multiple disk
redundancy, as long as multiple disks are lost from different RAID 5 arrays.
In a RAID 50, usable capacity is n − x disks, where x is the number of RAID 5 groups being
striped across. For example, in the smallest possible RAID 50, 6x 1TB disks arranged as two
RAID 5s and then striped together would give you 4TB of usable storage.
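A small helper for this capacity arithmetic, assuming equal-size disks; setting parity_per_group to 2 also reproduces the RAID 60 figures in the next section:

```python
# Usable capacity for nested parity RAID, assuming equal-size disks.
# parity_per_group = 1 gives RAID 50; 2 gives RAID 60.
def nested_usable_tb(total_disks: int, groups: int, disk_tb: float,
                     parity_per_group: int) -> float:
    return (total_disks - groups * parity_per_group) * disk_tb

print(nested_usable_tb(6, 2, 1.0, 1))  # RAID 50, 6x 1TB in two groups: 4.0
print(nested_usable_tb(8, 2, 1.0, 2))  # RAID 60, 8x 1TB in two groups: 4.0
```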

RAID 60
Good when: You have a similar use case to RAID 50, but need more redundancy.
Bad when: You don't have a substantial number of disks in the array.
RAID 6 is to RAID 60 as RAID 5 is to RAID 50. Essentially, you have more than one RAID 6 that
data is then striped across in a RAID 0. This setup allows for up to two members of any individual
RAID 6 in the set to fail without data loss. Rebuild times for RAID 60 arrays can be substantial,
so it's usually a good idea to have one hot-spare for each RAID 6 member in the array.
In a RAID 60, usable capacity is n − 2x disks, where x is the number of RAID 6 groups being
striped across. For example, in the smallest possible RAID 60, 8x 1TB disks arranged as two
RAID 6s and then striped together would give you 4TB of usable storage. As you can see, this
gives the same amount of usable storage that a RAID 10 would give on an 8-member array.
While RAID 60 would be slightly more redundant, the rebuild times would be substantially
longer. Generally, you want to consider RAID 60 only if you have a large number of disks.

RAID-Z
Good when: You are using ZFS on a system that supports it.
Bad when: Performance demands hardware RAID acceleration.
RAID-Z is a bit complicated to explain, since ZFS radically changes how storage and file systems
interact. ZFS encompasses the traditional roles of volume management (RAID is a function of a
volume manager) and file system. Because of this, ZFS can do RAID at the file's storage-block
level rather than at the volume's stripe level. This is exactly what RAID-Z does: it writes the file's
storage blocks across multiple physical drives, including a parity block for each set of stripes.
An example may make this much clearer. Say you have 3 disks in a ZFS RAID-Z pool and the
block size is 4KB. Now you write a file to the system that is exactly 16KB. ZFS will split that into
four 4KB blocks (as would a normal operating system); then it will calculate two blocks of parity.
Those six blocks will be placed on the drives similarly to how RAID 5 would distribute data and
parity. This is an improvement over RAID 5 in that there is no reading of existing data stripes to
calculate the parity.
Another example builds on the previous one. Say the file was only 4KB. ZFS will still have to build
one parity block, but now the write load is reduced to 2 blocks, and the third drive is free to
service any other concurrent requests. A similar effect will be seen any time the size of the file
being written is not a multiple of the pool's block size multiplied by the number of drives less one
(i.e. [File Size] ≠ [Block Size] × [Drives − 1]).
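The block arithmetic from both examples can be sketched as follows; note that real RAID-Z allocation is more involved (variable stripe width, padding), so this mirrors only the simplified examples above:

```python
# Block counts for the simplified examples: one parity block per stripe of
# (drives - 1) data blocks.
import math

def raidz_blocks(file_size: int, block_size: int, drives: int) -> tuple:
    data = math.ceil(file_size / block_size)
    parity = math.ceil(data / (drives - 1))
    return data, parity

print(raidz_blocks(16 * 1024, 4 * 1024, 3))  # (4, 2): six blocks in total
print(raidz_blocks(4 * 1024, 4 * 1024, 3))   # (1, 1): two blocks, one disk idle
```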
ZFS handling both Volume Management and File System also means you don't have to worry
about aligning partitions or stripe-block sizes. ZFS handles all that automatically with the
recommended configurations.
The nature of ZFS counteracts some of the classic RAID-5/6 caveats. All writes in ZFS are done
in a copy-on-write fashion; all changed blocks in a write operation are written to a new location
on disk, instead of overwriting the existing blocks. If a write fails for any reason, or the system
fails mid-write, the write transaction either occurs completely after system recovery (with the help
of the ZFS intent log) or does not occur at all, avoiding potential data corruption. Another issue
with RAID-5/6 is potential data loss or silent data corruption during rebuilds; regular zpool
scrub operations can help to catch data corruption or drive issues before they cause data loss,
and checksumming of all data blocks will ensure that all corruption during a rebuild is caught.
The main disadvantage of RAID-Z is that it is still software RAID (and suffers from the same
minor latency incurred by the CPU calculating the write load instead of letting a hardware HBA
offload it). This may be resolved in the future by HBAs that support ZFS hardware acceleration.
Other RAID and Non-Standard Functionality
Because there's no central authority enforcing any sort of standard functionality, the various RAID
levels have evolved and been standardized by prevalent use. Many vendors have produced
products that deviate from the above descriptions. It's also quite common for them to invent
some fancy new marketing terminology to describe one of the above concepts (this happens most
frequently in the SOHO market). When possible, try to get the vendor to actually describe the
functionality of the redundancy mechanism (most will volunteer this information, as there's really
no secret sauce anymore).
Worth mentioning, there are RAID 5-like implementations that allow you to start an array with
only two disks: data is stored on one stripe and parity on the other, similar to RAID 5 above.
This performs like RAID 1 with the extra overhead of the parity calculation. The advantage is
that you can later grow the array by adding disks and recalculating the parity.
