Hi Miles, good to hear from you again.
On Oct 2, 2009, at 1:20 PM, Miles Nordin wrote:
"re" == Richard Elling <richard.ell...@gmail.com> writes:
"r" == Ross <myxi...@googlemail.com> writes:
re> The answer to this question must be known before the
re> effectiveness of a checksum can be evaluated.
...well...we can use math to know that a checksum is effective. What
you are really suggesting we evaluate ``empirically'' is the degree of
INeffectiveness of the broken checksum.
By your logic, SECDED ECC for memory is broken because it only
corrects 1 bit per symbol and only detects brokenness of 2 bits per
symbol. However, the empirical evidence suggests that ECC provides
a useful function for many people. Do we know how many triple-bit
errors occur in memory? I can compute the probability, but I have
never seen a field failure analysis. So, if ECC is "good enough" for
DRAM, is fletcher2 "good enough" for storage?
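For illustration, a back-of-the-envelope binomial calculation (a
minimal sketch: independent bit flips and the per-bit rate are
assumptions chosen for illustration, not field data):

    # P(at least k of n bits flip), assuming independent flips.
    # The per-bit rate p is a hypothetical placeholder; real DRAM
    # error rates vary widely with technology and environment.
    from math import comb

    def p_at_least(n_bits, p, k):
        return sum(comb(n_bits, i) * p**i * (1 - p)**(n_bits - i)
                   for i in range(k, n_bits + 1))

    p = 1e-12                    # hypothetical per-bit flip rate
    print(p_at_least(72, p, 3))  # ~6e-32 per 72-bit SECDED word

Under the independence assumption, triple-bit errors are vanishingly
rare; correlated failures (a dead chip, a stuck row) are the real
worry, and those are exactly what a field failure analysis would show.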
NB, for DRAM the symbol size is usually 64 bits. For the ZFS case, the
symbol size is 4,096 to 1,048,576 bits. AFAIK, no collisions have been
found in SHA-256 digests for symbols of size 1,048,576, but it has not
been proven that they do not exist.
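The "by chance" part is easy to bound with the birthday approximation
(the block count below is a hypothetical exabyte-scale pool, not a
measurement):

    # Birthday bound: with n random inputs and a d-bit digest, the
    # collision probability is roughly n^2 / 2^(d+1).
    # n is hypothetical: an exabyte's worth of 128 kByte blocks.
    n = 2**43                  # 2^60 bytes / 2^17 bytes per block
    d = 256
    print(n**2 / 2**(d + 1))   # ~3e-52

That says nothing about whether a collision can be constructed, only
that you will not stumble into one by accident.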
r> ZFS stores two copies of the metadata for any block, so
r> corrupt metadata really shouldn't happen often.
The other copy probably won't be read if the first copy read has a
valid checksum. I think it'll more likely just lazy-panic instead.
If that's the case, the two copies won't help cover up the broken
checksum bug. But Richard's table says metadata uses fletcher4, which
the OP said is as good as the correct algorithm would have been, even
in its broken implementation, so long as it's only used up to
128 kByte. It's only data and the ZIL that have the relevantly-broken
checksum, according to his math.
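For anyone following along, the structure under discussion looks
roughly like this (a simplified sketch, not the actual ZFS source):

    # Two interleaved Fletcher-style streams over 64-bit words, with
    # every accumulator silently wrapping mod 2^64. A textbook
    # Fletcher checksum keeps its sums under a modulus small enough
    # that the second-order sums stay informative; with mod-2^64
    # wraparound the high-order bits lose that property on large
    # blocks, which is the weakness the earlier analysis quantifies.
    M = 2**64

    def fletcher2_like(words):   # words: 64-bit ints, even count
        a0 = a1 = b0 = b1 = 0
        for i in range(0, len(words), 2):
            a0 = (a0 + words[i]) % M
            a1 = (a1 + words[i + 1]) % M
            b0 = (b0 + a0) % M
            b1 = (b1 + a1) % M
        return (a0, a1, b0, b1)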
re> The overwhelming empirical evidence suggests that fletcher2
re> catches many storage system corruptions.
What do you mean by the word ``many''? It's a weasel-word.
I'll blame the lawyers. They are causing me to remove certain words
from my vocabulary :-(
It basically means, AFAICT, ``the broken checksum still trips
sometimes.'' But have you any empirical evidence about the fraction
of real-world errors which are still caught by the broken checksum
vs. those that are not? I don't see how you could.
Question for the zfs-discuss participants: have you seen a data
corruption that was not detected when using fletcher2?
Personally, I've seen many corruptions of data stored on file systems
lacking checksums.
How about cases where checksums are not used to correct bit-flip
gremlins but relied upon to determine whether a data structure is
fully present (committed) yet, like in the ZIL, or to determine which
half of a mirror is stale---these are cases where checksums could be
wrong even if the storage subsystem is functioning in an ideal way.
Checksum weakness on ZFS where checksums are presumed good by other
parts of the design could potentially be worse overall than a
checksumless design. That's not my impression, but it's the right
place to put the bar. Ray's ``well at least it's better than no
checksums'' is wrong because it presumes ZFS could function as well as
another filesystem if ZFS were using a hypothetical null checksum. It
couldn't.
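To make that failure mode concrete, a toy sketch (names and layout
are illustrative only, not the actual ZIL on-disk format):

    # Log replay that uses the checksum as the commit marker: walk
    # the records and stop at the first one whose checksum doesn't
    # verify, treating it as the uncommitted tail. A weak checksum
    # can fail both ways here: falsely reject a committed record
    # (lost writes) or falsely accept stale garbage (replaying data
    # that was never committed).
    def replay_log(records, stored_sums, checksum):
        committed = []
        for rec, stored in zip(records, stored_sums):
            if checksum(rec) != stored:
                break            # presumed end of committed log
            committed.append(rec)
        return committed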
I'm in Ray's camp. I've got far too many scars from data corruption
and I'd rather not add more.
-- richard
Anyway, I'm glad the problem is both fixed and avoidable on the
broken systems. I just think the doublespeak after the fact is, once
again, not helping anyone.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss