On Tue, Mar 23, 2010 at 07:22:59PM -0400, Frank Middleton wrote:
> On 03/22/10 11:50 PM, Richard Elling wrote:
>
>> Look again, the checksums are different.
>
> Whoops, you are correct, as usual. Just 6 bits out of 256 different...
>
> Look which bits are different - digits 24, 53-56 in both cases.

This is very likely an error introduced during the calculation of
the hash, rather than an error in the input data. I don't know how
that helps narrow down the source of the problem, though.

It suggests an experiment: try switching to another hash algorithm.
It may move the problem around, or even make it worse, of course.

I'm also reminded of a thread about the implementation of fletcher2
being flawed; perhaps you're better off switching regardless.

>>> o Why is the file flagged by ZFS as fatally corrupted still accessible?
>
> This is the part I was hoping to get answers for since AFAIK this
> should be impossible. Since none of this is having any operational
> impact, all of these issues are of interest only, but this is a bit scary!

It's only the blocks with bad checksums that should return errors.
Maybe you're not reading those, or the transient error doesn't recur
the next time you actually try to read the block, or when it is read
from the other side of the mirror.

Repeated errors in the same file could also be a symptom of an error
calculating the hash when the file was written. If there's a
bit-flipping issue at the root of it, with some given probability,
that would invert the probabilities of "correct" and "error" results.

--
Dan.
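
The reasoning above about "only a few bits differ" can be illustrated
with a rough sketch. It assumes a hash with strong diffusion; SHA-256
is used purely for illustration, and the pool in this thread may well
use a different checksum that behaves differently. The helper names
and the test block are made up for the example. With such a hash, a
single flipped bit in the input should change roughly half of the
output bits, so two checksums that differ in only a handful of bits
are more plausibly the result of damage to the checksum itself.

```python
import hashlib
from typing import List


def differing_hex_digits(hex_a: str, hex_b: str) -> List[int]:
    """1-based positions (counted from the left) of hex digits that differ."""
    if len(hex_a) != len(hex_b):
        raise ValueError("checksums must have the same length")
    return [i + 1 for i, (a, b) in enumerate(zip(hex_a, hex_b)) if a != b]


def differing_bit_count(hex_a: str, hex_b: str) -> int:
    """Number of bits that differ between two equal-length hex strings."""
    if len(hex_a) != len(hex_b):
        raise ValueError("checksums must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Made-up input: a 4 KiB block of zeros and a copy with one bit flipped.
block = bytes(4096)
damaged = bytearray(block)
damaged[0] ^= 0x01

good = hashlib.sha256(block).hexdigest()
bad = hashlib.sha256(bytes(damaged)).hexdigest()

# Error in the input data: expect a large fraction of the 256 bits to differ.
print("bits differing:", differing_bit_count(good, bad), "of 256")
print("hex digits differing:", len(differing_hex_digits(good, bad)), "of 64")
```

Running differing_hex_digits() on the two checksums reported by the
pool would list the digit positions directly, which makes it easy to
see whether the same positions are affected each time.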
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss