On 04/16/09 04:39, casper....@sun.com wrote:
You really believe that the copy was copied and checksummed twice before
writing to the disk? Of course not. Copying the data doesn't help;
both pieces of memory need to be good. It's checksummed once.
If OpenSolaris succeeds in being significantly adopted as a desktop OS,
it is going to be running on some pretty grotty hardware: no ECC memory,
cheap PCI controllers, etc. Clearly this computer has hardware problems;
my guess is the PCI itself, although it seems to run Linux and OpenSolaris
quite happily. If the memory is so bad that two separately computed
checksums fail, then I doubt it would run anything reliably. FWIW it passes
every diagnostic I've run, but that doesn't prove anything...
ZFS can't catch the case where the data is bad before it is checksummed,
so we can ignore that one for this discussion. This scenario seems to have
bad checksums or bad data (or both) being written to both disks.
So why not copy and store the data+checksum twice? In the grand scheme of
things, it is hard to believe that this would add significant overhead
(it might even speed things up if both disks can be written in parallel?),
and it would help diagnose what is a novel problem.
Let CSA and CSB be the stored checksums, and CRA and CRB the checksums
recomputed after the data is read from each mirror. Presumably a scrub
always reads both sides of a mirror, so all permutations are possible.
One interesting case is where CSA == CSB and CRA == CRB but CSA != CRA
vs. the case where all 4 checksums are different. It seems improbable
that two disks would fail in the same way at the same moment, so this
scenario would point at some other source of error.
It would be helpful to know which scenario is happening.
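
A rough sketch of how the four checksums could be sorted into cases, in
Python purely for illustration. The function name and the case labels are
my own invention, not anything in ZFS, and the checksums are treated as
opaque values that can be compared for equality:

```python
def classify_mirror_checksums(csa, csb, cra, crb):
    """Sort the four checksums of a two-way mirror into a diagnostic case.

    csa, csb: checksums stored with sides A and B of the mirror.
    cra, crb: checksums recomputed from the data read back from A and B.
    The returned labels are hypothetical names used only in this sketch.
    """
    a_ok = csa == cra  # side A verifies against its own stored checksum
    b_ok = csb == crb  # side B verifies against its own stored checksum

    if a_ok and b_ok:
        return "clean"
    if a_ok != b_ok:
        # Exactly one side fails: the ordinary single-disk error,
        # repairable from the good side.
        return "one-side-bad"

    # From here on, both sides fail verification.
    if csa == csb and cra == crb:
        # Both sides agree with each other but not with the stored
        # checksum: the same bad data or bad checksum reached both disks,
        # which points upstream of the disks (memory, bus, controller).
        return "same-mismatch-both-sides"
    if len({csa, csb, cra, crb}) == 4:
        # Nothing agrees with anything else.
        return "all-four-differ"
    return "other-mismatch"
```

For example, classify_mirror_checksums("aa", "aa", "bb", "bb") falls into
the "same-mismatch-both-sides" case, the one that would suggest the damage
happened before the data ever reached the disks.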
Good old reliable Sun products with ECC bus and memory simply don't
have this kind of problem. The hardware detects it long before it becomes
a software issue. Not so with el-cheapo PCs whose owners will likely
be frustrated (see the "[zfs-discuss] How recoverable is an 'unrecoverable
error'?" thread) when their previously seemingly reliable disks start to
apparently fail in mysterious ways.
I'd like to submit an RFE suggesting that data + checksum be copied for
mirrored writes, but I won't waste anyone's time doing so unless you
think there is a point. One might argue that a machine this flaky should
be retired, but it is actually working quite well, and perhaps represents
not even the extreme of bad hardware that ZFS might encounter.
Cheers -- Frank