> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Bob Friesenhahn
> 
> It is very unusual to obtain the same number of errors (probably same
> errors) from two devices in a pair.  This should indicate a common
> symptom such as a memory error (does your system have ECC?),
> controller glitch, or a shared power supply issue.

Bob's right.  I didn't notice that both sides of the mirror have precisely
56 checksum errors.  Ignore what I said about adding a 3rd disk to the
mirror.  It won't help.  A 3rd disk would only have been useful if the
corrupt blocks on these 2 disks weren't the same blocks.
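
To make that concrete, here's a rough sketch of how a mirror read picks a
good copy.  This is not ZFS code.  The function names are made up for
illustration, and SHA-256 is only a stand-in for whatever checksum the pool
actually uses.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # Stand-in for the pool's block checksum (assumption: SHA-256).
    return hashlib.sha256(data).digest()

def read_with_self_heal(copies, expected):
    """Return the first copy whose checksum matches, or None if all are bad."""
    for data in copies:
        if checksum(data) == expected:
            return data
    return None

good = b"original block"
expected = checksum(good)

# Damage on one side only: the other side still verifies, so the read succeeds.
one_side_bad = read_with_self_heal([b"garbage", good], expected)

# Identical damage on both sides: no copy verifies.  A third disk would be
# filled from these same two copies, so it would hold the same bad block.
both_bad = [b"garbage", b"garbage"]
third_disk = both_bad[0]
all_bad = read_with_self_heal(both_bad + [third_disk], expected)
```

Here one_side_bad comes back as the good block and all_bad comes back as
None, which is the situation with the same blocks bad on both disks.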

I think you have to accept that you have corrupt data.  And
you should run some memory diagnostics on your system to see if you can
detect failing memory.  The cause is not necessarily memory, as Bob
pointed out, but a typical way to produce the result you're seeing is this:
ZFS calculates a checksum of a block it's about to write to disk, and of
course that checksum is stored in ram.  If it's stored in corrupt ram, then
the checksum that gets written to disk no longer matches the data.  The
same faulty checksum gets written to both sides of the mirror, so both
sides show the same errors.  It is discovered later during your scrub.
There is no un-corrupt copy of the data that ZFS thought it wrote.
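
Here's a toy simulation of that sequence, in case it helps to see why both
sides end up with the same error count.  It's only a sketch under
assumptions: write_block, scrub and flip_bit are hypothetical helpers, the
"bad ram" is modeled as a single flipped bit in the checksum, and SHA-256
stands in for the real checksum.  It doesn't model how ZFS lays anything
out on disk.

```python
import hashlib

def checksum(data: bytes) -> bytes:
    # Stand-in for the pool's block checksum (assumption: SHA-256).
    return hashlib.sha256(data).digest()

def flip_bit(buf: bytes, bit: int) -> bytes:
    # Model a single-bit memory error in a buffer.
    out = bytearray(buf)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)

def write_block(data: bytes, ram_is_bad: bool) -> dict:
    stored_sum = checksum(data)
    if ram_is_bad:
        # The checksum is damaged in memory before it reaches the disks.
        stored_sum = flip_bit(stored_sum, 0)
    # The same data and the same checksum go to both halves of the mirror.
    return {"sum": stored_sum, "disk_a": data, "disk_b": data}

def scrub(blocks) -> dict:
    # Re-read every copy and compare it against the stored checksum.
    errors = {"disk_a": 0, "disk_b": 0}
    for blk in blocks:
        for side in errors:
            if checksum(blk[side]) != blk["sum"]:
                errors[side] += 1
    return errors

# Every third write passes through the bad memory.
blocks = [write_block(b"block %d" % i, ram_is_bad=(i % 3 == 0)) for i in range(9)]
result = scrub(blocks)
```

The scrub counts 3 errors on disk_a and 3 on disk_b, on the same blocks.
Matching counts on both sides of a mirror point at something upstream of
the disks.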

At least it's detected by ZFS.  Without checksumming, that error would pass
undetected.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
