Re: [zfs-discuss] integrated failure recovery thoughts (single-bit

Anton B. Rang Tue, 12 Aug 2008 21:15:37 -0700

Reed-Solomon could correct multiple-bit errors, but an effective Reed-Solomon 
code for 128K blocks of data would be very slow if implemented in software 
(and, for that matter, take a lot of hardware to implement). A multi-bit 
Hamming code would be simpler, but I suspect that undetected multi-bit errors 
are quite rare.


I've seen a fair number of single-bit errors coming from SATA drives because 
the data is often not parity-protected through the whole data path within the 
drive. Some enterprise-class SATA disks have data protected (with a 
parity-equivalent) through the write data path, and more of these models will 
have this feature soon. All SAS and Fibre Channel drives (that I am aware of) 
have data protected with ECC through the whole path for both reads and writes.
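
To illustrate why a parity-protected path catches these single-bit errors but 
can miss multi-bit ones, here is a minimal sketch. It assumes one even-parity 
bit over a whole buffer; real drives protect much smaller units, and the 
names (parity_bit, flip_bit, sector) are made up for the example.

```python
def parity_bit(data: bytes) -> int:
    """Return the even-parity bit (0 or 1) over every bit in data."""
    ones = sum(bin(b).count("1") for b in data)
    return ones & 1


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


# Hypothetical payload standing in for data moving through the drive.
sector = b"example sector payload"
stored_parity = parity_bit(sector)

one_flip = flip_bit(sector, 5)        # single-bit error
two_flips = flip_bit(one_flip, 42)    # a second, different bit also flips

# An odd number of flipped bits changes the parity; an even number does not.
single_detected = parity_bit(one_flip) != stored_parity
double_detected = parity_bit(two_flips) != stored_parity
```

Here single_detected comes out True and double_detected comes out False, 
which is the limitation that ECC addresses.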

Single-bit errors can also be introduced in non-ECC DRAM, of course. In this 
case, it can happen either before the checksum computation (=> undetected data 
corruption) or after it (=> checksum failure on a later read).
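
The two DRAM cases can be sketched as follows. This is only an illustration, 
not ZFS code: SHA-256 from Python's hashlib stands in for the block checksum, 
and the helper names and the 1 KiB block are assumptions made for the example.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    """Stand-in block checksum (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(block).digest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


original = bytes(range(256)) * 4  # hypothetical 1 KiB block the app wrote

# Case 1: the bit flips in RAM BEFORE the checksum is computed.
# The checksum is computed over already-corrupt data, so it matches.
bad_in_ram = flip_bit(original, 1000)
stored_block_1, stored_sum_1 = bad_in_ram, checksum(bad_in_ram)
case1_verifies = checksum(stored_block_1) == stored_sum_1
case1_data_intact = stored_block_1 == original

# Case 2: the bit flips AFTER the checksum is computed.
# The stored checksum describes the good data, so a later read fails.
stored_sum_2 = checksum(original)
stored_block_2 = flip_bit(original, 1000)
case2_verifies = checksum(stored_block_2) == stored_sum_2
```

In case 1 the block verifies even though the data is wrong (case1_verifies is 
True, case1_data_intact is False); in case 2 verification fails 
(case2_verifies is False), so the error is at least detected.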
 
 
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
