Given that the checksum algorithms used in ZFS are already fairly CPU
intensive, I can't help but wonder whether it has been verified that the
majority of checksum failures are in fact single-bit errors. If so, it may
be advantageous to use a computationally simpler hybrid checksum/Hamming
code (as you've suggested). Although such a hybrid would not detect as high
a percentage of all possible failures, it could correct the theoretical
majority (single-bit errors) while retaining the ability to detect a large
majority of the remaining errors (which correspondingly occur less
frequently), ideally consuming no more overhead than the existing checksum
algorithm, while improving the apparent resilience of even storage devices
that are not otherwise configured redundantly.
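
As a rough illustration of the kind of thing I mean (purely a sketch of the
general technique, not ZFS code; the names, structure, and 512-byte block
below are made up for the example), a Hamming-style syndrome plus an overall
parity bit stored alongside the existing checksum is enough to locate and
flip a single flipped bit, while still flagging most multi-bit errors as
uncorrectable:

#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Illustrative only: syndrome is the XOR of the 1-based indices of all set
 * bits, which is equivalent to the parity bits of a Hamming code; together
 * with an overall parity bit this gives single-error correction. */
typedef struct {
	uint32_t syndrome;	/* XOR of 1-based indices of set bits */
	uint8_t  parity;	/* parity of all data bits */
} ecc_t;

static void
ecc_compute(const uint8_t *buf, size_t len, ecc_t *ecc)
{
	ecc->syndrome = 0;
	ecc->parity = 0;
	for (size_t byte = 0; byte < len; byte++) {
		for (int bit = 0; bit < 8; bit++) {
			if (buf[byte] & (1u << bit)) {
				ecc->syndrome ^= (uint32_t)(byte * 8 + bit + 1);
				ecc->parity ^= 1;
			}
		}
	}
}

/* Returns 0 if clean, 1 if a single-bit error was corrected in place,
 * -1 if an uncorrectable (multi-bit) error was detected. */
static int
ecc_verify_and_correct(uint8_t *buf, size_t len, const ecc_t *stored)
{
	ecc_t now;
	ecc_compute(buf, len, &now);

	uint32_t diff = now.syndrome ^ stored->syndrome;
	int parity_bad = now.parity != stored->parity;

	if (diff == 0 && !parity_bad)
		return (0);
	if (parity_bad && diff != 0 && diff <= len * 8) {
		size_t pos = diff - 1;		/* back to a 0-based bit index */
		buf[pos / 8] ^= (uint8_t)(1u << (pos % 8));
		return (1);
	}
	return (-1);
}

int
main(void)
{
	uint8_t block[512];
	ecc_t ecc;

	memset(block, 0xA5, sizeof (block));
	ecc_compute(block, sizeof (block), &ecc);

	block[100] ^= 0x10;	/* inject a single-bit error */
	printf("result: %d\n",
	    ecc_verify_and_correct(block, sizeof (block), &ecc));
	printf("repaired: %s\n", block[100] == 0xA5 ? "yes" : "no");
	return (0);
}

The existing checksum would of course remain the authority on whether the
block is good; the syndrome would only offer a cheap repair attempt before
declaring the block lost.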

(Although I confess I haven't done such an analysis yet, I suspect someone
more intimately familiar with the implementation trade-offs of error
detection/correction algorithms may have some interesting suggestions, as
having a strong detection capability without the ability to recover data
that may otherwise be easily recoverable, in lieu of potentially
catastrophic data loss, does not seem reasonable.)
 
 