RAID arrays can survive the failure of one or more disk drives, but they aren't invulnerable. If you ignore the warning messages when the first drive fails, more failures could follow until the array itself fails. Sometimes more than one drive, or even the whole array, will fail at the same time, because the drives usually sit in the same location and are exposed to the same risks.
Causes of RAID Failure
A RAID array of spinning disk drives is subject to the same risks as individual drives. The leading causes of failure are:
- Defects in manufacture. Some drives are defective from the start and slip past quality control. The result is a failure of the electronics or the surface of the drive, usually not long after installation.
- Physical damage. Catastrophic events like fires and floods can ruin drives, and they'll affect all the disks they reach. Impact, either from dropping the drive or from something heavy striking it, can cause permanent harm. These events may force the drive head into contact with the disk, damaging the disk's surface. This is known as a head crash.
- Overheating. Improper ventilation makes a drive run above its acceptable temperature range. This can cause warping of the disk; only a tiny warp is necessary to make it unusable. It may also damage the electronics or the magnetic coating on the drive.
- Power surges. A large electrical surge can destroy a drive's electronics. The platters are often still good, but the drive's controller board has turned into a piece of junk.
- Software failure. Software may write corrupted data, by accident or through malware infection. Writing one malformed sector in the wrong place can make the whole file system useless.
Any of these problems can affect multiple disks at once. Sometimes they will only make one drive fail, but others in the same array could be damaged and fail soon afterward. The array controller may fail, causing a RAID failure, even if the component drives are still intact.
The best protection against RAID failure is backup, preferably to a remote location. Using RAID reduces the chance of data loss but doesn't eliminate it. Malware and user error can wipe out data, even if the drive is functioning perfectly. Always keep your storage backed up.
RAID Array Protection
Depending on the RAID level, the failure of one or more component drives doesn't necessarily mean the array has failed. Here is how many drive failures the most common RAID levels can tolerate:
- RAID 0 has no redundancy. Any drive's failure means the array's failure.
- RAID 1 consists of 2 or more mirrored disks. It can survive the failure of all but one of them.
- RAID 4 and 5 use block-level striping with parity. The data can be reconstructed if any one drive fails, but not more than one.
- RAID 6 is like RAID 5 with an additional parity block. It can survive the failure of two drives.
- RAID 10 combines mirroring and striping. It can survive as long as at least one drive in each mirrored set is functional, so it can always withstand the loss of one drive.
- RAID 50 stripes data across multiple RAID 5 sets, and RAID 60 stripes data across multiple RAID 6 sets. A RAID 50 array can survive one drive failure in each RAID 5 set, and a RAID 60 array can survive two drive failures in each RAID 6 set.
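The rules in this list can be written as a short check. The following is a minimal sketch, not a vendor tool: the function names, the drive labels, and the idea of describing an array as a list of drive groups are all assumptions made for illustration. A simple level (0, 1, 4, 5, 6) is one group. A nested level (10, 50, 60) is treated as a stripe across several groups, so the array survives only if every group stays within its own limit.

```python
def group_tolerance(level: str, group_size: int) -> int:
    """Return how many failed drives one group can absorb (illustrative helper)."""
    if level == "0":
        return 0                      # no redundancy
    if level in ("1", "10"):
        return group_size - 1         # mirrors: all but one may fail
    if level in ("4", "5", "50"):
        return 1                      # single parity
    if level in ("6", "60"):
        return 2                      # double parity
    raise ValueError(f"unsupported RAID level: {level}")


def array_survives(level: str, groups: list[list[str]], failed: set[str]) -> bool:
    """Check whether the array is still readable given a set of failed drives.

    groups: one list of drive labels per mirrored set or parity set.
    failed: labels of the drives that have failed.
    """
    for group in groups:
        lost = sum(1 for drive in group if drive in failed)
        if lost > group_tolerance(level, len(group)):
            return False              # this group is gone, so the stripe is gone
    return True


# Hypothetical RAID 10 array built from two mirrored pairs.
raid10 = [["a", "b"], ["c", "d"]]
print(array_survives("10", raid10, {"a", "c"}))  # one drive lost per pair
print(array_survives("10", raid10, {"a", "b"}))  # a whole pair lost
```

The two RAID 10 calls show why the count of failed drives alone doesn't decide the outcome for nested levels: what matters is which group the failed drives belong to.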
Handling a RAID Failure
The failure of any drive in an array is a serious matter, even if the array is designed to keep running. When one drive has failed, others may be on the verge of failing. The array should be taken offline as quickly as possible, to avoid additional damage. Replacing the failed drive is a straightforward task for IT personnel, but the array needs testing afterward to make sure there are no residual problems.
If there's an actual RAID failure, there's little that the typical IT department can do to repair it. Any service work requires specialized equipment and skills. NJ Data Recovery Labs' skilled technicians know how to deal with this type of problem. In many cases, we can fully recover the data.
The chances of recovery are highest if you take the array out of service as soon as there is a clear sign of failure. Some arrays keep limping along after losing a drive, but continued use may cause additional damage.
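Spotting that clear sign early is easier with an automated check. The sketch below assumes a Linux software RAID (md) array, which this article does not otherwise cover; hardware controllers report status through their own vendor tools. It scans text in the format of /proc/mdstat, where a status such as [3/2] means three drives are expected and two are active. The function name and the sample text are illustrative, not output from a real system.

```python
import re

# "[3/2] [UU_]" -> 3 drives expected, 2 active
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
# "md1 : active raid5 ..." -> array name at the start of the line
ARRAY_RE = re.compile(r"^(md\S+)\s*:")


def find_degraded(mdstat_text: str) -> dict[str, int]:
    """Map each degraded array name to its number of missing drives."""
    degraded = {}
    current = None
    for line in mdstat_text.splitlines():
        array = ARRAY_RE.match(line)
        if array:
            current = array.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected:
                degraded[current] = expected - active
    return degraded


# Illustrative sample in /proc/mdstat format (not from a real machine).
SAMPLE = """\
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sdc1[0] sdd1[1] sde1[2](F)
      1953260544 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]

unused devices: <none>
"""

print(find_degraded(SAMPLE))
```

On a real Linux host, the same function could be given the contents of /proc/mdstat and run on a schedule, so a degraded array triggers an alert instead of going unnoticed.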
How NJ Data Recovery Labs Can Help
You never know when you'll have to deal with a RAID failure, whether it's from an overvoltage, a physical shock, or an unexplained component failure. But you can be prepared.
We handle RAID failures of all types, on all drive models. Our server recovery services include clean room recovery and restoration of databases. We provide 24/7 service and quick response, so you'll be back in operation with a minimum of downtime. Our prices are competitive, and pickup and delivery are free anywhere in New Jersey. If you ever need to recover a failed RAID array — or if you're looking here because you need it right now — give us a call. We're ready to help.