Sep 9, 2009

Basic RAID Level Information

I found this information helpful so, as usual, I copied it somewhere I can find it quickly. There is always the Wikipedia RAID page as well.

In the IT world, hardware failure is not a question of if it will happen, but when. If you run a server that holds any sort of important data, protecting that data is essential. Many people choose to implement a RAID (redundant array of independent disks) setup to reduce the risk that comes with a hardware failure. There are several types of RAID that are appropriate for servers, and there are several ways they can be effectively implemented.

IBM patented the idea of RAID in 1978. It was not until 1988 that the RAID levels we have come to know were defined. This work was done at the University of California, Berkeley. Nowadays RAID is used in many servers throughout the world and even in desktop machines.

RAID 1
A mirroring RAID array, or RAID 1, is useful in server situations. It keeps an exact copy of the original drive on a second drive. If either drive fails, the system can continue operating without any downtime. A replacement hard drive can then be put into the system, and the array rebuilds itself onto it.

This system is considered a little less desirable than a RAID 5 setup for most day-to-day operation. However, there are several applications where RAID 1 can be beneficial. One advantage is that it has a faster seek time than RAID 5, which makes it a good fit for data that will not be written to often. The main advantage is that some 1U servers do not have room for a 3-drive array, so implementing RAID 1 is often considered better for reliability than no RAID at all. However, the most useful way that I have seen RAID 1 used in the real world is as a backup. With a hot-swappable setup, the mirror disk can be removed and kept as a backup, much the same way a tape backup can be stored. This proves to be very useful for mission-critical systems, since it allows a system to be brought back online after a data failure, or brought up on separate hardware after a catastrophic hardware failure.

RAID 5
Probably the most common disk array used in enterprise computing is a RAID 5 array. This is because it offers a good balance of disk usage, reliability, and speed of access. To get an idea of how it works: the array needs at least 3 drives, and data is striped across all of them together with parity information. If any one drive fails, its contents can be rebuilt from the data and parity on the remaining drives.
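RAID 5 protects data with parity, and the usual way to compute that parity is a byte-wise XOR across the blocks in a stripe. The sketch below is only an illustration of that idea on a hypothetical 3-drive stripe (two data blocks plus one parity block). The function name and sample data are made up for the example, and a real controller also rotates the parity block across drives, which is not shown here.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 3-drive array: two data blocks plus parity.
data_a = b"HELLO..."
data_b = b"WORLD!!!"
parity = xor_blocks(data_a, data_b)

# Simulate losing the drive that held data_b and rebuild it from the survivors.
rebuilt_b = xor_blocks(data_a, parity)
print(rebuilt_b == data_b)
```

The same call rebuilds whichever block is missing, because XOR-ing the surviving blocks always yields the lost one.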

On mission-critical servers, RAID 5 is often used with a cache that has an attached battery backup. This ensures that in a power failure, no transactions are lost from the server. Database servers with a high volume of transactions will often have a battery unit, because the RAID card caches writes before they reach the disks. Without the battery, a power failure could lose whatever was still in the cache and result in an inconsistent database or critical data loss.

RAID 10
This type of RAID array requires 4 or more drives. At the top level is a RAID 0 array, which combines lower-level RAID 1 arrays. This type of RAID array has a benefit over RAID 5 in that it has faster write times. This often makes it a somewhat better choice than RAID 5 for database servers.
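To make the "RAID 0 on top of RAID 1" layering concrete, here is a rough sketch of where a logical block could land. It assumes a stripe unit of one block and a disk numbering where mirror pair i is made of disks 2i and 2i+1. Both are simplifications chosen for the example, and the function name is hypothetical.

```python
from typing import Tuple


def raid10_locate(block: int, mirror_pairs: int) -> Tuple[int, int, Tuple[int, int]]:
    """Return (pair index, offset within the pair, the two disks holding the block)."""
    if mirror_pairs < 2:
        raise ValueError("RAID 10 needs at least 2 mirror pairs (4 drives)")
    pair = block % mirror_pairs      # RAID 0 layer: stripe blocks across the pairs
    offset = block // mirror_pairs   # position of the block inside that pair
    disks = (2 * pair, 2 * pair + 1) # RAID 1 layer: both disks get a copy
    return pair, offset, disks


# Four drives = two mirror pairs; consecutive blocks alternate between pairs.
for block in range(4):
    print(block, raid10_locate(block, mirror_pairs=2))
```

Each write goes to just the two disks of one pair, with no parity to update.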

Space Calculations
To calculate usable space for RAID 1, you simply divide the total drive space by 2. For RAID 5, you multiply the total space of the drives by (number of drives - 1) / (number of drives); in other words, one drive's worth of space goes to parity. For RAID 10, you add up the usable space of each RAID 1 array.

RAID 1
2 80 GB Drives
160/2=80

RAID 5
3 80 GB Drives
240*(2/3)=160

5 80 GB Drives
400*(4/5)=320

RAID 10
4 80 GB Drives
2 RAID 1 Arrays = 160GB usable
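The calculations above can be wrapped in a small helper. This is a minimal sketch that assumes all drives are the same size, RAID 1 is a two-drive mirror, and RAID 10 is built from two-drive mirrors. The function name is made up for the example.

```python
def usable_capacity_gb(level: int, drives: int, drive_size_gb: float) -> float:
    """Usable space for RAID 1, 5, or 10 built from identical drives."""
    if level == 1:
        if drives != 2:
            raise ValueError("this sketch treats RAID 1 as a two-drive mirror")
        return drive_size_gb
    if level == 5:
        if drives < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space is used for parity.
        return (drives - 1) * drive_size_gb
    if level == 10:
        if drives < 4 or drives % 2 != 0:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        # Each two-drive mirror contributes one drive's worth of space.
        return (drives // 2) * drive_size_gb
    raise ValueError("unsupported RAID level")


print(usable_capacity_gb(1, 2, 80))
print(usable_capacity_gb(5, 3, 80))
print(usable_capacity_gb(5, 5, 80))
print(usable_capacity_gb(10, 4, 80))
```

The four calls correspond to the four worked examples listed above.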

Hardware RAID

The use of hardware RAID arrays no longer makes as much sense as it once did. However, there are still reasons to use hardware RAID over software RAID. The first is that it usually has a cache, which speeds up the operation of the array dramatically. The second is that it will not cut into system resources as much as software RAID. The biggest advantage is the possibility of having a battery-backed cache, which helps prevent corruption from an unexpected power issue or a system crash.

Software RAID
Although historically all RAID arrays were completely hardware based, software RAID is growing in popularity. One of the reasons for this is that CPUs are now fast enough that the processing time involved in managing the RAID array is minimal compared to the processor's overall capacity.

One of the major advantages of software RAID is that it can be set up on commodity hardware, so the physical disks can easily be moved to another server in the event of a hardware failure that does not involve the disks. The biggest disadvantage is that software RAID has no dedicated controller cache, so the speed at which data can flow from the operating system is limited by the speed of the drives.

Notifications
Let's assume you have a working RAID setup now. If a drive fails, the system continues as if nothing ever happened. The problem is that without notification of a drive failure, there is little point in having RAID. So make sure you set up a system to notify you whenever a drive fails. It may notify you by email, page you, or show an alert on some control panel. The important thing is that you know before a second drive fails and renders the array useless.
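As one possible starting point for such a check, the sketch below looks for degraded arrays in the text of /proc/mdstat. It assumes Linux software RAID (md), where each array's status line ends with something like [2/2] [UU], and an underscore marks a missing drive. Hardware controllers need their own vendor tools instead. The function name and the sample text are made up for the example.

```python
import re
from typing import List

ARRAY_LINE = re.compile(r"^(md\w+)\s*:")                      # e.g. "md0 : active raid1 ..."
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")         # e.g. "[3/2] [_UU]"


def degraded_arrays(mdstat_text: str) -> List[str]:
    """Return the names of md arrays that are missing one or more drives."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Sample text in the /proc/mdstat layout: md0 is healthy, md1 has lost a drive.
SAMPLE = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      1048512 blocks [2/2] [UU]

md1 : active raid5 sdc1[2] sdb2[1] sda2[0](F)
      2097024 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU]

unused devices: <none>
"""

print(degraded_arrays(SAMPLE))
```

On a real machine you would pass in the contents of /proc/mdstat from a scheduled job and send the resulting list to whatever notification channel you use.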

Conclusion
Hopefully you will be able to make some important purchasing decisions for your next server after reading this article. There are a lot of things to consider when planning data availability, so make sure you spend enough time to get everything right. Remember that no single RAID setup is best for all applications.

by Tyler Weaver
