Frank Middleton <f.middle...@apogeect.com> writes:

> Exactly. My whole point. And without ECC there's no way of knowing.
> But if the data is damaged /after/ checksum but /before/ write, then
> you have a real problem...
we can't do much to protect ourselves from damage to the data itself (an extra copy in RAM would help little and ruin performance). damage to the bits holding the computed checksum before it is written can be alleviated by doing the calculation independently for each written copy; in particular, this will help if the bit error is transient. since the number of octets in RAM holding the checksum is dwarfed by the number of octets occupied by data (256 bits vs. one mebibit, a 4096:1 ratio, for a full default-sized 128 KiB record), such a paranoia mode will most likely tell you that the *data* is corrupt, not the checksum. today you don't know either way, so it's an improvement in my book.

> Quoting the ZFS admin guide: "The failmode property ... provides the
> failmode property for determining the behavior of a catastrophic
> pool failure due to a loss of device connectivity or the failure of
> all devices in the pool." Has this changed since the ZFS admin
> guide was last updated? If not, it doesn't seem relevant.

I guess checksum error handling is orthogonal to this and should have its own property. it sure would be nice if the admin could ask the OS to deliver the bits contained in a file, no matter what, and just log the problem.

> Cheers -- Frank

thank you for pointing out this potential weakness in ZFS' consistency checking; I didn't realise it was there. also, thank you to all ZFS developers for your great job :-)

-- 
Kjetil T. Homme
Redpill Linpro AS - Changing the game

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
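[Editor's note: the "compute the checksum independently per copy" idea above can be sketched as follows. This is a minimal Python illustration, not ZFS code; SHA-256 and a 128 KiB record stand in for whatever checksum and record size the pool actually uses, and the bit-flip is an assumed transient RAM error injected by hand.]

```python
import hashlib

def write_copy(data: bytes) -> tuple[bytes, bytes]:
    # Each redundant copy gets its own independently computed checksum,
    # so a transient bit error in one checksum buffer does not
    # propagate into the other copy's metadata.
    return data, hashlib.sha256(data).digest()

def flip_bit(buf: bytes, bit: int) -> bytes:
    # Simulate a single-bit RAM error at the given bit offset.
    b = bytearray(buf)
    b[bit // 8] ^= 1 << (bit % 8)
    return bytes(b)

record = b"x" * (128 * 1024)   # one full default-sized 128 KiB record

# Two redundant copies, checksummed independently.
data_a, sum_a = write_copy(record)
data_b, sum_b = write_copy(record)

# Transient RAM error hits the first checksum before it is written out.
sum_a = flip_bit(sum_a, 17)

# On read-back, copy A fails verification but copy B still passes:
# the disagreement tells us the checksum, not the data, was damaged.
ok_a = hashlib.sha256(data_a).digest() == sum_a
ok_b = hashlib.sha256(data_b).digest() == sum_b
print(ok_a, ok_b)   # False True
```

Note the converse case from the start of the message: if the bit flip hit the *data* before both checksums were computed, both copies would verify cleanly against the corrupted data, which is why an extra in-RAM copy of the data buys so little.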