>>>>> "csb" == Craig S Bell <cb...@standard.com> writes:

   csb> Two: If you lost data with another filesystem, you may have
   csb> overlooked it and blamed the OS or the application,

yeah, but with ZFS you often lose the whole pool in certain classes of
repeatable real-world failures, like hot-swap disks with flaky power
or SANs without NVRAM where the target reboots and the initiator does
not.  Losing the whole pool is materially different from corrupting the
insides of a few files.  Yes, I know, the red-eyed ZFS rats
will come out of the walls screaming ``that 1 bit could have been
critical Banking Data on which millions of lives depend and nuclear
reactors and spaceships too!  Wouldn't you rather KNOW, even if ZFS
decides to inform you with zpool_self-destruct_condescending-error()?''
Maybe, sometimes, yes, but USUALLY, **NO**!

I've no objection to deciding how much recovery tooling is needed based
on experience rather than on wide-eyed kool-aid ranting or presumptions
carried over from earlier filesystems, but so far experience says the
recovery work was really needed, so I can't agree with the bloggers
rehashing each other's zealotry.

It would be nice to isolate and fix the underlying problems, though.
That is the part of the spirit of all these ``we don't need no fsck
because we are perfect'' blogs that I do agree with.  Their overoptimism
isn't as honest as I'd like about how ZFS's error messages do not do
enough to lead us toward the real cause of SAN problems.  The messages
are all designed presuming spatially-clustered, temporally-spread,
disk-based failures rather than temporally-clustered interconnect
failures, so in the SAN case the error detection becomes no more than
``printf("simon sez u will not blame me, blame someone else.  these
aren't the droids you're looking for.  move along.");'' ....but, yeah,
the bloggers' point about banging on the whole stack until it works,
rather than concealing errors, is a good one.  Unfortunately I don't
think that's what will actually happen with these dropped-write SAN
failures.  I think people will just use the new recovery bits, which
conceal errors just like earlier filesystems and fsck tools, and
shrug.


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss