The following is something to consider when setting up RAID arrays. At the
moment AFAIK every RAID solution suffers from this problem. :(
I have a Linux software RAID-1 array consisting of two IBM IDE hard drives.
The latest kernel works the same way as the 2.4.2 kernel I am using on that
machine.
I have just had them both fail at the same time! Both had quite a number
of bad sectors, but there was no sector that was bad on both disks!
The result I would have liked to see is that when a bad sector is
encountered during a read from disk 0, disk 1 is read instead. If the data
can be read from disk 1 then it should be written back to disk 0. If after
that disk 0 can be read (the likely result, given sector sparing in
hardware) then the kernel should log lots of loud printk() errors and keep
running.
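
To make the behaviour I am asking for concrete, here is a minimal sketch
in Python. It is a user-space simulation, not kernel code and not how the
md driver works today. FakeDisk, BadSector and mirror_read are names I
made up for illustration, and the assumption that a write to a bad sector
makes the drive remap it is exactly that, an assumption.

```python
class BadSector(Exception):
    """Raised when a simulated disk cannot read a sector."""


class FakeDisk:
    """In-memory stand-in for one half of a mirror (hypothetical)."""

    def __init__(self, sectors, bad=(), remaps_on_write=True):
        self.sectors = list(sectors)
        self.bad = set(bad)                  # sectors that fail on read
        self.remaps_on_write = remaps_on_write

    def read(self, n):
        if n in self.bad:
            raise BadSector(n)
        return self.sectors[n]

    def write(self, n, data):
        self.sectors[n] = data
        # Assumption: the drive spares the sector when it is rewritten.
        if self.remaps_on_write:
            self.bad.discard(n)


def mirror_read(primary, secondary, n, log=print):
    """Read sector n, repairing the primary from the secondary if needed."""
    try:
        return primary.read(n)
    except BadSector:
        pass
    data = secondary.read(n)     # raises BadSector if both copies are bad
    primary.write(n, data)       # write the good copy back
    try:
        primary.read(n)
        log("WARNING: sector %d repaired from mirror, replace disk soon" % n)
    except BadSector:
        log("ERROR: sector %d still unreadable, disk should be failed" % n)
    return data
```

The point of the sketch is that one bad sector costs a loud warning
rather than the whole disk, so the array stays redundant for every sector
that is still good on both sides.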
The result I saw was that disk 0 was marked as failed. Then, when a
different sector failed on disk 1, the ext2 file system saw errors, the
system stopped functioning correctly and needed a hard reset. It then
panicked on boot because it couldn't add either disk to the RAID-1. Since
then I have been trying to recover it. I wrote a program to read both disks
and take data from disk 1, falling back to disk 0 when disk 1 returned a
bad sector. But this didn't work well, because disk 1 had run for some time
without disk 0, so the sectors taken from disk 0 could be out of date.
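
For anyone wanting to try the same kind of recovery, this is roughly the
idea, as a Python sketch rather than the program I actually used. The
names merge and device_reader, the 512-byte sector size, and the choice
to zero-fill sectors that are bad on both disks are all assumptions made
for the example. os.pread() is Unix-only.

```python
import os

SECTOR = 512  # assumed sector size in bytes


def device_reader(path, sector_size=SECTOR):
    """Return a function that reads sector n from a device or image file."""
    fd = os.open(path, os.O_RDONLY)

    def read(n):
        # A bad sector normally surfaces as OSError (EIO) from pread().
        data = os.pread(fd, sector_size, n * sector_size)
        if len(data) != sector_size:
            raise OSError("short read at sector %d" % n)
        return data

    return read


def merge(read_preferred, read_fallback, nsectors, write_out,
          sector_size=SECTOR):
    """Copy every sector, preferring one disk and falling back to the other.

    Returns (from_fallback, lost): sector numbers taken from the fallback
    disk, and sector numbers unreadable on both (written as zeros).
    """
    from_fallback, lost = [], []
    for n in range(nsectors):
        try:
            data = read_preferred(n)
        except OSError:
            try:
                data = read_fallback(n)
                from_fallback.append(n)   # possibly stale, check later
            except OSError:
                data = bytes(sector_size)
                lost.append(n)
        write_out(n, data)
    return from_fallback, lost
```

The list of sectors that came from the fallback disk is the important
output: those are the ones that may hold old data if the preferred disk
kept running on its own, so they tell you which files to distrust.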
In summary, a situation which could have been salvaged by an emergency
visit to a computer store turned into a catastrophe. :(
--
http://www.coker.com.au/bonnie++/ Bonnie++ hard drive benchmark
http://www.coker.com.au/postal/ Postal SMTP/POP benchmark
http://www.coker.com.au/projects.html Projects I am working on
http://www.coker.com.au/~russell/ My home page