On Mon, 3 Mar 2008, Nathan Kroenert wrote:
> Speaking of expensive, but interesting things we could do -
>
> From the little I know of ZFS's checksum, it's NOT like the ECC
> checksum we use in memory in that it's not something we can use to
> determine which bit flipped in the event that there was a single bit
> flip in the data. (I could be completely wrong here... but...)

It seems that the emphasis on single-bit errors may be misplaced. Is
there evidence which suggests that single-bit errors are much more
common than multiple-bit errors?

> What is the chance we could put a little more resilience into ZFS such
> that if we do get a checksum error, we systematically flip each bit in
> sequence and check the checksum to see if we could in fact proceed
> (including writing the data back correctly.).

It is easier to retry the disk read another 100 times or to store the
data in multiple places.

> Or build into the checksum something analogous to ECC so we can choose
> to use NON-ZFS protected disks and paths, but still have single bit flip
> protection...

Disk drives commonly use an algorithm like Reed-Solomon
(http://en.wikipedia.org/wiki/Reed-Solomon_error_correction), which
provides forward error correction. This is done in hardware. Doing
the same in software is likely to be very slow.

> What do others on the list think? Do we have enough folks using ZFS on
> HDS / EMC / other hardware RAID(X) environments that might find this useful?

It seems that since ZFS is intended to support extremely large storage
pools, available effort should be spent ensuring that the storage pool
remains healthy or can be repaired. Loss of individual file blocks is
annoying, but loss of an entire storage pool is devastating.

Since raw disk is cheap (and backups are expensive), it makes sense to
write more redundant data rather than to minimize loss through exotic
algorithms. Even if RAID is not used, redundant copies may be stored
on the same disk to help protect against block read errors.

Bob
======================================
Bob Friesenhahn
[EMAIL PROTECTED], http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
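
For a rough sense of what the bit-flipping proposal quoted above would
involve, here is a minimal sketch in Python. It is not ZFS code: the
names checksum() and repair_single_bit_flip() are made up for
illustration, and SHA-256 is only assumed as a stand-in for whatever
checksum the block actually carries.

```python
import hashlib


def checksum(block: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the block checksum.
    return hashlib.sha256(block).digest()


def repair_single_bit_flip(block: bytes, expected: bytes):
    """Try every single-bit flip of `block`.

    Returns a corrected copy if exactly one flipped bit makes the
    checksum match `expected`, otherwise None.
    """
    candidate = bytearray(block)
    for byte_index in range(len(candidate)):
        for bit in range(8):
            mask = 1 << bit
            candidate[byte_index] ^= mask      # flip one bit
            if checksum(bytes(candidate)) == expected:
                return bytes(candidate)
            candidate[byte_index] ^= mask      # undo the flip
    return None


if __name__ == "__main__":
    good = bytes(range(64))
    stored_checksum = checksum(good)

    damaged = bytearray(good)
    damaged[10] ^= 0x04                        # simulate one flipped bit
    repaired = repair_single_bit_flip(bytes(damaged), stored_checksum)
    print("repaired:", repaired == good)
```

Each trial has to checksum the whole block again, so the work grows
with the square of the block size. For a 128 KiB block, for example,
that is up to 1,048,576 trials of 128 KiB each, or roughly 128 GiB
pushed through the checksum in the worst case, and the search finds
nothing at all if two or more bits were damaged.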