Re: Failing disk advice

David Christensen Sun, 05 Mar 2017 20:39:07 -0800

On 03/05/2017 01:02 PM, Gregory Seidman wrote:

I have a disk that is reporting SMART errors. It is an active disk in a
(kernel, not hardware) RAID1 configuration. I also have a hot spare in the
RAID1, and md hasn't decided it should fail the disk and switch to the hot
spare. Should I proactively tell md to fail the disk (and let the hot spare
take over), or should I just wait until md notices a problem?

AFAIK desktop disks and "enterprise RAID" disks degrade differently.
When a desktop disk is having trouble reading a sector, it will retry
many times before giving up because it is likely the data does not exist
anywhere else. But an enterprise RAID disk will retry only a few times
and then fail, because the data should exist elsewhere and hung reads
are intolerable in enterprise environments. So, if you are using
desktop disks in a RAID, you might need to manually intervene to
compensate for the mismatch.
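One way to look at that mismatch, as a rough sketch rather than a recipe: smartmontools can report and set a disk's SCT Error Recovery Control limit, which is the retry timeout described above. The device name /dev/sda, the 7-second value, and the helper names run and show_and_cap_erc are assumptions for illustration only. Many desktop disks do not support this setting at all, and on those that do it may not survive a power cycle. The script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 (and run as root) to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# show_and_cap_erc DISK DECISECONDS
# DISK is a placeholder such as /dev/sda; 70 deciseconds = 7 seconds.
show_and_cap_erc() {
    disk=$1
    limit=$2
    # Report the current read/write error recovery limits.
    run smartctl -l scterc "$disk"
    # Cap both the read and the write limit at the same value.
    run smartctl -l scterc,"$limit","$limit" "$disk"
}

show_and_cap_erc /dev/sda 70
```

If the disk rejects the setting, the usual alternative is to raise the kernel's own command timeout for that device so it outlasts the disk's retries.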

I'm confused by "I also have a hot spare in the RAID1". Do you have a
two-member RAID1 with a hot spare, or a three-member RAID1? I would
prefer the latter:


https://manpages.debian.org/jessie/mdadm/md.4.en.html

If you're planning on buying a fourth disk and adding it after fixing
the RAID, can you add it now as a fourth RAID1 member, let it resilver,
remove the failing disk from the RAID (e.g. reconfigure as three-member
RAID1), and then pull the failing disk?
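The sequence just described might look roughly like the following with mdadm. This is a sketch under assumptions, not a tested procedure for your system: /dev/md0, /dev/sdd1 (new disk) and /dev/sda1 (failing disk) are placeholder names, and run and replace_member are hypothetical helpers. It prints the commands unless DRY_RUN=0 is set, so they can be reviewed first.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 (and run as root) to actually execute them.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# replace_member ARRAY NEW OLD TARGET
# TARGET is the number of active members wanted once OLD is gone.
replace_member() {
    array=$1
    new=$2
    old=$3
    target=$4
    # Add the new disk; it joins the array as a spare.
    run mdadm "$array" --add "$new"
    # Temporarily grow by one active member so the new disk gets synced.
    run mdadm --grow "$array" --raid-devices=$((target + 1))
    # Block until the resync has finished.
    run mdadm --wait "$array"
    # Only then mark the failing disk faulty and take it out of the array.
    run mdadm "$array" --fail "$old" --remove "$old"
    # Shrink back to the intended member count.
    run mdadm --grow "$array" --raid-devices="$target"
}

replace_member /dev/md0 /dev/sdd1 /dev/sda1 3
```

The point of the ordering is that the failing disk stays in the array until the new one is fully synced, so redundancy never drops during the swap. Checking /proc/mdstat before the --fail step is a sensible extra precaution.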



David
