>>>>> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
re> If your pool is not redundant, the chance that data corruption can
re> render some or all of your data inaccessible is always present.

1. data corruption != unclean shutdown

2. other filesystems do not need a mirror to recover from an unclean
   shutdown.  They only need it for when disks fail, or for when disks
   misremember their contents (silent corruption, as in the NetApp
   paper).  I would call data corruption and silent corruption the same
   thing: what the CKSUM column was _supposed_ to count, though not in
   fact the only thing it counts.

3. saying ZFS needs a mirror to recover from an unclean shutdown does
   not agree with the claim ``always consistent on the disk''

4. I'm not sure exactly what your position is.  Before, you were saying
   that what Erik warned about doesn't happen, because there's no CR,
   and that Tom must be confused too.  Now you're saying of course it
   happens, and that ZFS's claim of ``always consistent on disk''
   counts for nothing unless you have pool redundancy.  And that is
   exactly what I said to start with:

re> In general, ZFS can only repair conditions for which it owns
re> data redundancy.

c> If that's really the excuse for this situation, then ZFS is
c> not ``always consistent on the disk'' for single-vdev pools.

Is that the take-home message?  If so, it still leaves me with a
concern: what if the breaking of one component in a mirrored vdev
takes my system down uncleanly?  This seems like a really plausible
failure mode (as Tom said, ``the inevitable kernel panic'').  In that
case, I no longer have any redundancy when the system boots back up.
If ZFS calls the inconsistent states through which it apparently
sometimes transitions pools ``data corruption,'' and depends on
redundancy to recover from them, then isn't it extremely dangerous to
remove power or SAN connectivity from any DEGRADED pool?  The pool
should be rebuilt onto a hot spare IMMEDIATELY so that it's ONLINE as
soon as possible, because if ZFS loses power with a DEGRADED pool,
all bets are off.
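(For what it's worth, keeping the DEGRADED window short is mostly a matter of having a spare configured ahead of time.  A minimal sketch, assuming a pool named `tank' and a spare device c2t0d0 -- both hypothetical names:)

```shell
# dedicate a hot spare to the pool so a resilver can start without
# waiting for an administrator
zpool add tank spare c2t0d0

# have ZFS swap the spare in automatically when a device faults
zpool set autoreplace=on tank

# afterwards, confirm the pool has returned to ONLINE
zpool status tank
```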
If this DEGRADED-pool unclean shutdown is, as you say, a completely
different scenario from the single-vdev pool, one that isn't dangerous
and has no trouble with ZFS corruption, then no one should ever run a
single-vdev pool.  We should instead run mirrored vdevs that are always
DEGRADED, since this configuration looks identical to everything
outside ZFS but supposedly magically avoids the issue.  If only we had
some way to attach to vdevs fake mirror components that immediately get
marked FAULTED, then we could avoid the corruption risk.  But that's
clearly absurd!

So, let's say ZFS's requirement is, as we seem to be describing it:
you might lose the whole pool if your kernel panics or you pull the
power cord in a situation without redundancy.  Then I think this is an
extremely serious issue, even for redundant pools.  It is very
plausible that a machine will panic or lose power during a resilver.

And if, on the other hand, ZFS doesn't transition disks through
inconsistent states and then excuse itself by calling what it did
``data corruption'' when it bites you after an unclean shutdown, then
what happened to Erik and Tom?  It seems to me it is ZFS's fault and
can't be punted off to the administrator's ``asking for it.''
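(The reductio above isn't even hypothetical: ZFS supports file-backed vdevs, so one can literally build the always-DEGRADED mirror.  A sketch with made-up device names and sizes; the backing file must be at least as large as the real disk, and the offlined half shows as OFFLINE rather than FAULTED, but the pool sits in DEGRADED either way:)

```shell
# create a sparse file at least as large as the real disk
# (hypothetical path and size)
mkfile -n 500g /var/tmp/fake-vdev

# mirror the real disk with the file...
zpool attach tank c1t0d0 /var/tmp/fake-vdev

# ...then immediately take the file half offline; the pool is now a
# permanently DEGRADED mirror that behaves like a single-vdev pool
# to everything outside ZFS
zpool offline tank /var/tmp/fake-vdev
```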
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss