>>>>> "nw" == Nicolas Williams <nicolas.willi...@sun.com> writes:
nw> Your thesis is that all corruption problems observed with ZFS
nw> on SANs are: a) phantom writes that never reached the rotating
nw> rust, b) not bit rot, corruption in the I/O paths, ...
nw> Correct?

yeah. By ``all'' I mean the several single-LUN pools that were recovered by using an older set of ueberblocks. Of course I don't mean ``all'' as in every pool imaginable, including that one 10 years ago on an unnamed Major Vendor's RAID shelf that gave you a scar just above the ankle.

But so far it really does sound like there is just one major problem with single-LUN ZFS pools on SANs. Or am I wrong, and there are lots of pools which can't be recovered with old ueberblocks?

Remember, the problem is losing pools. It is not, ``for weeks I kept losing files. I would get errors reported in 'zpool status', and it would tell me the filename 'blah' has uncorrectable errors. This went on for a while, then one day we lost the whole pool.'' I've heard zero reports like that.

nw> Some of the earlier problems of type (2) were triggered by
nw> checksum verification failures on pools with no redundancy, but

Checksum failures in ZFS aren't caused only by bit rot. I get hundreds of them after half of my iSCSI mirror bounces, because of the incomplete-resilvering bug.

I don't know the on-disk format well, but maybe the checksum was wrong because the label pointed to a block that wasn't an ueberblock. Or maybe the checksum is functioning in lieu of a commit sector: maybe all four ueberblocks were written incompletely, because of some bug or missing workaround in the way ZFS flushes and schedules the ueberblock writes, so with some sectors written and some not, the overall block checksum is wrong.

Maybe this is a downside of the filesystem-level checksum. For integrity it's an upside, but the netapp block-level checksum, where you checksum just the data plus the block number at the RAID layer, would narrow checksum failures down to disk bit flips only, and so would be better for tracking down problems and building statistics comparable with other systems. We already know the 'zpool status' CKSUM column isn't that selective: it can catch out-of-date data too. (There's a rough sketch of that distinction at the end of this message.)

The overall point, what I'd rather have as my ``thesis,'' is that you can't allow ZFS to exonerate itself with an error message. Losing the whole pool in a situation where UFS would have corrupted a bit of data (or _might_ have; it isn't even proven beyond doubt that it _would_) isn't an advantage just because ZFS can printf a warning that says ``loss of entire pool detected. must be corruption outside ZFS!''
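Since the CKSUM-selectivity point keeps coming up, here is a rough, hypothetical sketch in plain Python (not ZFS or netapp code, just toy names I made up) of the distinction: a RAID-layer checksum over the data plus its block number only catches bit flips and misdirected writes, while a ZFS-style checksum remembered in the parent block pointer also flags stale data left behind by a phantom write, which is exactly why the CKSUM column isn't a clean bit-rot statistic.

import hashlib

disk = {}   # block_no -> (data, RAID-layer checksum); toy stand-in for a LUN

def block_level_checksum(block_no, data):
    # netapp-style idea: checksum the data plus its block number,
    # stored alongside the block at the RAID layer
    return hashlib.sha256(block_no.to_bytes(8, "big") + data).digest()

def write_block(block_no, data):
    disk[block_no] = (data, block_level_checksum(block_no, data))

def read_block(block_no):
    data, stored = disk[block_no]
    if block_level_checksum(block_no, data) != stored:
        raise IOError("bit flip or misdirected write at block %d" % block_no)
    return data

write_block(7, b"old contents")        # written at txg N
# phantom write: txg N+1's update never reaches the platter
# write_block(7, b"new contents")      # ...lost somewhere in the SAN

# the RAID-layer checksum still verifies: the stale data is
# self-consistent, so this failure is invisible down at that layer
assert read_block(7) == b"old contents"

# ZFS-style: the parent block pointer remembers the checksum of the
# data it expects (the new contents), so the stale block shows up as
# a CKSUM error even though nothing on the disk actually rotted
expected = hashlib.sha256(b"new contents").digest()
actual = hashlib.sha256(read_block(7)).digest()
print("ZFS-style CKSUM error:", expected != actual)    # True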