>>>>> "csb" == Craig S Bell <cb...@standard.com> writes:
csb> Two: If you lost data with another filesystem, you may have
csb> overlooked it and blamed the OS or the application,

yeah, but with ZFS you often lose the whole pool in certain classes of repeatable real-world failures, like hotswap disks with flaky power or SANs without NVRAM where the target reboots and the initiator does not. Losing the whole pool is relevantly different to corrupting the insides of a few files.

Yes, I know, the red-eyed screaming ZFS rats will come out of the walls screaming ``that 1 bit could have been critical Banking Data on which millions of lives depend and nuclear reactors and spaceships too! Wouldn't you rather KNOW, even if ZFS decides to inform you with zpool_self-destruct_condescending-error()?'' Maybe, sometimes, yes, but USUALLY, **NO**!

I've no objection to deciding how much recovery tooling is needed based on experience rather than wide-eyed kool-aid ranting or presumptions from earlier filesystems, but so far experience says the recovery work was really needed, so I can't agree with the bloggers rehashing each other's zealotry.

It would be nice to isolate and fix the underlying problems, though. That is the spirit in all these ``we don't need no fsck because we are perfect'' blogs with which I do agree. Their overoptimism isn't as honest as I'd like about the way ZFS's error messages do not do enough to lead us toward the real cause in the case of SAN problems. The messages are all designed presuming spatially-clustered, temporally-spread, disk-based failures rather than temporally-clustered interconnect failures, so the error detection becomes no more than ``printf("simon sez u will not blame me, blame someone else. these aren't the droids you're looking for. move along.");''

....but, yeah, the bloggers' point of banging on the whole stack until it works, rather than concealing errors, is a good one. Unfortunately I don't think that's what will actually happen with these dropped-write SAN failures.
I think people will just use the new recovery bits, which conceal errors just like earlier filesystems and fsck tools, and shrug.
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss