Anyone using RAID of any type (hardware, md, LVM, Btrfs, ZFS) really
needs to be aware of mismatches between the drive's SCT ERC setting
and the kernel's SCSI command timer. The mismatch happens by default
if you're using consumer hard drives, many of which now either ship
with SCT ERC disabled or do not support it at all. This mismatch will
eventually lead to array collapse, even in the face of just a single
disk failure for RAID5 or a two-disk failure for RAID6. RAID1 is a bit
more tolerant, but only because a sector read error on a degraded
RAID1 doesn't take the array offline: other files can still be
retrieved in such a case, assuming the sector errors haven't resulted
in data corruption.
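
To make the mismatch concrete: the kernel gives up on a command once
its SCSI command timer expires (30 seconds by default, exposed in
/sys/block/<dev>/device/timeout), while a drive without SCT ERC may
keep retrying a bad sector for longer than that. Below is a minimal
Python sketch of that comparison. The function names and the 120
second worst-case figure for drives without ERC are assumptions for
illustration, not values from any spec. The ERC value itself would
come from something like `smartctl -l scterc /dev/sdX`, which reports
it in tenths of a second.

```python
from pathlib import Path
from typing import Optional


def read_kernel_timeout(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Return the kernel SCSI command timer for a disk, in seconds."""
    path = Path(sysfs_root, dev, "device", "timeout")
    return int(path.read_text().strip())


def is_mismatched(
    erc_deciseconds: Optional[int],
    kernel_timeout_s: int,
    assumed_worst_case_s: float = 120.0,  # assumption, not a spec value
) -> bool:
    """True if the drive may still be retrying when the kernel gives up.

    erc_deciseconds is the SCT ERC read limit in tenths of a second,
    or None when SCT ERC is disabled or unsupported.
    """
    if erc_deciseconds is None:
        # No drive-side limit: fall back to the assumed worst case.
        drive_limit_s = assumed_worst_case_s
    else:
        drive_limit_s = erc_deciseconds / 10
    return drive_limit_s >= kernel_timeout_s
```

For example, is_mismatched(None, 30) flags the default consumer-drive
case, while is_mismatched(70, 30) (ERC set to 7.0 seconds) does not.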

The linux-raid@ list is full of these kinds of horror stories.
Invariably the answer is: yep, your raid5 with just one dead drive?
It's toast, or at least very tedious to recover data from, because
one or more surviving drives have one or more bad sectors and md
can't recover from that automatically.


Chris Murphy