When to replace hard drive in a RAID array
-
Replacing a drive that is showing signs of failure, or that has already failed, is going to result in the array becoming degraded and having to resilver.
Personally I would do it sooner rather than later. But the end result is the same.
-
Also I wish you luck with the resilver process.
-
What are the sizes of the drives?
-
@Texkonc 3 TB (WD Red)
-
Also, be proactive and order the drive now. Personally I would order 2.
I also like to keep one spare on hand if you are set up without a hot spare.
-
If you have the space, back the data up to another location, blow away the RAID 5, toss the sick drive, and rebuild into a RAID 10. Far less risk that way vs. adding a new drive and praying that it rebuilds, plus no extra disks needed. The sooner the better on making a new array; I don't know if I would risk replacing a drive in a RAID 5 array (I'm making the assumption that these are 1TB+ drives, which means that you have about as much chance of a successful rebuild as you have of getting hit by lightning).
-
@Texkonc I have one in stock already, and I agree with you and @DustinB3403 that it would be wiser to go ahead and take care of this now. Especially since I don't have a solid idea of when I can get to the project of getting off the RAID 5.
-
@EddieJennings said in When to replace hard drive in a RAID array:
@Texkonc 3 TB (WD Red)
Ouch... if you try to rebuild that array and it works (don't hold your breath), I'd go out and buy a ticket to every lottery you can, because you'll never have that kind of luck again.
-
NOW is the time to move to something other than RAID 5. The most dangerous thing that you can do is replace that drive. Even once it fails, you don't replace it. Whoever designed that system made the decision that a failed drive meant moving off of the NAS when they installed it (you should explain this as an existing decision to management).
The time that data gets lost is in the resilver.
-
That's a 12TB failure domain on 5400RPM consumer drives. The resilver operation will take days, and the chances of success are well below 50%. So any drive replacement means you INTEND for all data on the array to be lost. You might easily get lucky and survive, but chances are you won't. So only replace that drive if you plan for days of downtime, and then, at a random point during those days of outage, to have all of the data be lost.
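For anyone who wants to see where "well below 50%" comes from, here is a rough sketch of the usual back-of-envelope math. The drive count, drive size, and URE rate below are my assumptions (consumer drives are commonly specced at 1 unrecoverable read error per 10^14 bits read), not figures confirmed in this thread.

```python
import math

# Rough back-of-envelope: chance a RAID 5 rebuild finishes without hitting
# an unrecoverable read error (URE). Assumes a consumer-class URE rate of
# 1 in 1e14 bits read and that any URE during the rebuild kills the array.

def rebuild_success_probability(surviving_drives: int,
                                drive_size_tb: float,
                                ure_rate_bits: float = 1e14) -> float:
    """Probability of reading every bit on the surviving drives without a URE."""
    bits_to_read = surviving_drives * drive_size_tb * 1e12 * 8
    return math.exp(-bits_to_read / ure_rate_bits)

# Hypothetical example: a 5 x 3 TB RAID 5 rebuilding after one failure
# must read the 4 surviving members end to end (12 TB of raw reads).
p = rebuild_success_probability(surviving_drives=4, drive_size_tb=3.0)
print(f"Estimated rebuild success chance: {p:.0%}")  # roughly 38%
```

Plug in your own drive count and URE spec; the point is that the odds drop fast as the amount of data that must be read climbs into the terabytes.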
-
@RojoLoco said in When to replace hard drive in a RAID array:
@EddieJennings said in When to replace hard drive in a RAID array:
@Texkonc 3 TB (WD Red)
Ouch... if you try to rebuild that array and it works (don't hold your breath), I'd go out and buy a ticket to every lottery you can, because you'll never have that kind of luck again.
I had a StoreVirtual dual-node SAN with 24 4TB drives take 7.5 days to repair without issue or popping another drive. RAID 6, thankfully.
Edit: Thankfully I got Dev/QA to approve shutting down the servers they would not need for a week, to lessen the load. Might have been bad times if I didn't.
-
@Texkonc said in When to replace hard drive in a RAID array:
@RojoLoco said in When to replace hard drive in a RAID array:
@EddieJennings said in When to replace hard drive in a RAID array:
@Texkonc 3 TB (WD Red)
Ouch... if you try to rebuild that array and it works (don't hold your breath), I'd go out and buy a ticket to every lottery you can, because you'll never have that kind of luck again.
I had a StoreVirtual dual-node SAN with 24 4TB drives take 7.5 days to repair without issue or popping another drive. RAID 6, thankfully.
I don't know if I could handle 7.5 days without sleep!
-
@scottalanmiller The chance of failure on the resilver is what frightens me, which, contrary to what I posted a couple of minutes ago, makes me want to make the drive swap happen when I redo the RAID as RAID 10. For that matter, I'll also look and see what the cost would be to add drives to the server that connects to the NAS via iSCSI and just have the data stored locally.
-
@EddieJennings said in When to replace hard drive in a RAID array:
@scottalanmiller The chance of failure on the resilver is what frightens me...
That's why you should deal with the whole thing now. Consider it an emergency situation.
-
A wealth of knowledge has been gained in the last few minutes -- in particular, how long it would take to resilver an array, which puts into perspective how dangerous RAID 5 is.
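For reference, a quick sketch of how to ballpark resilver time. The effective rebuild rate here is my assumption, since controllers rebuilding under production load often run far below the drives' sequential speed.

```python
# Rough resilver-duration estimate: the rebuild has to rewrite the whole
# replacement drive, so wall-clock time is roughly drive size divided by
# the effective rebuild rate (an assumed figure; real rates vary widely).

def resilver_days(drive_size_tb: float, effective_rate_mb_s: float) -> float:
    """Days to rebuild one replaced drive at a given effective rebuild rate."""
    seconds = (drive_size_tb * 1e12) / (effective_rate_mb_s * 1e6)
    return seconds / 86400

for rate in (10, 30, 100):  # MB/s, from heavily loaded to nearly idle
    print(f"{rate:>3} MB/s -> {resilver_days(3.0, rate):.1f} days")
```

At realistic rates under load, a single 3 TB member already lands in the multi-day range, which matches the experiences reported above.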
-
@RojoLoco said in When to replace hard drive in a RAID array:
If you have the space, back the data up to another location, blow away the RAID 5, toss the sick drive, and rebuild into a RAID 10. Far less risk that way vs. adding a new drive and praying that it rebuilds, plus no extra disks needed. The sooner the better on making a new array; I don't know if I would risk replacing a drive in a RAID 5 array (I'm making the assumption that these are 1TB+ drives, which means that you have about as much chance of a successful rebuild as you have of getting hit by lightning).
^ this exactly. Order larger drives today if you have too little space after RAID10 conversion.
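To put rough numbers on why larger drives may be needed, here is a small capacity sketch. The drive counts are illustrative assumptions, since the thread doesn't state the exact layout of the array.

```python
# Usable-capacity comparison between RAID 5 and RAID 10 (illustrative
# drive counts; the actual array layout isn't stated in the thread).

def raid5_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """RAID 5 loses one drive's worth of capacity to parity."""
    return (drive_count - 1) * drive_size_tb

def raid10_usable_tb(drive_count: int, drive_size_tb: float) -> float:
    """RAID 10 mirrors pairs, so only half of an even drive count is usable."""
    return (drive_count // 2) * drive_size_tb

print(raid5_usable_tb(5, 3.0))     # 12.0 TB from 5 x 3 TB in RAID 5
print(raid10_usable_tb(6, 3.0))    #  9.0 TB from 6 x 3 TB in RAID 10
print(raid10_usable_tb(6, 4.0))    # 12.0 TB from 6 x 4 TB in RAID 10
```

Same shelf of disks, noticeably less usable space in RAID 10, hence the advice to order larger drives now if the converted array would come up short.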
-
@RojoLoco said in When to replace hard drive in a RAID array:
@Texkonc said in When to replace hard drive in a RAID array:
@RojoLoco said in When to replace hard drive in a RAID array:
@EddieJennings said in When to replace hard drive in a RAID array:
@Texkonc 3 TB (WD Red)
Ouch... if you try to rebuild that array and it works (don't hold your breath), I'd go out and buy a ticket to every lottery you can, because you'll never have that kind of luck again.
I had a StoreVirtual dual-node SAN with 24 4TB drives take 7.5 days to repair without issue or popping another drive. RAID 6, thankfully.
I don't know if I could handle 7.5 days without sleep!
Trust me, I woke up some nights to see if my VPN (an RRAS VM) was still up, then logged into the storage and checked the percent complete.
-
@EddieJennings said in When to replace hard drive in a RAID array:
@scottalanmiller For that matter, I'll also look and see what the cost would be to add drives to the server that connects to the NAS via iSCSI and just have the data stored locally.
That would likely make way more sense.
-
Yeah, they're still going to be of the same quality (NAS drives), but you'd be in a non-parity array.
Adding more drives would be a boost if you can fit them in, as the entire system will operate that much more quickly.
-
@Texkonc said in When to replace hard drive in a RAID array:
@RojoLoco said in When to replace hard drive in a RAID array:
@Texkonc said in When to replace hard drive in a RAID array:
@RojoLoco said in When to replace hard drive in a RAID array:
@EddieJennings said in When to replace hard drive in a RAID array:
@Texkonc 3 TB (WD Red)
Ouch... if you try to rebuild that array and it works (don't hold your breath), I'd go out and buy a ticket to every lottery you can, because you'll never have that kind of luck again.
I had a StoreVirtual dual-node SAN with 24 4TB drives take 7.5 days to repair without issue or popping another drive. RAID 6, thankfully.
I don't know if I could handle 7.5 days without sleep!
Trust me, I woke up some nights to see if my VPN (an RRAS VM) was still up, then logged into the storage and checked the percent complete.
This is why I keep a bottle of Pepto + sleeping pills in my tech emergency kit.