Miles Nordin wrote:
>>>>>> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
>>>>>> "tb" == Tom Bird <[EMAIL PROTECTED]> writes:
>>>>>>             
>
>     tb> There was a problem with the SAS bus which caused various
>     tb> errors including the inevitable kernel panic, the thing came
>     tb> back up with 3 out of 4 zfs mounted.
>
>     re> In general, ZFS can only repair conditions for which it owns
>     re> data redundancy.
>
> If that's really the excuse for this situation, then ZFS is not
> ``always consistent on the disk'' for single-VDEV pools.
>   

I disagree with your assessment.  The on-disk format (any on-disk format)
necessarily assumes no faults on the media.  The difference between the ZFS
on-disk format and that of most other file systems is that the metadata will
be consistent as of some point in time because it is COW.  With UFS, for
instance, the metadata is overwritten in place, which is why it cannot be
considered always consistent (and why fsck exists).
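To make the distinction concrete, here is a toy Python model (my own
illustration, not the actual ZFS or UFS on-disk layout): an in-place
overwrite interrupted by a crash can tear the only copy of the metadata,
while a COW update that crashes before the final pointer flip leaves the
previous version intact.

```python
# Toy model (NOT the real ZFS/UFS format): crash behavior of
# overwrite-in-place vs copy-on-write metadata updates.

class Disk:
    def __init__(self):
        self.blocks = {}   # block address -> contents
        self.root = None   # pointer to the current metadata root

def update_in_place(disk, addr, data, crash_midway=False):
    """UFS-style: overwrite the live block; a crash can leave it torn."""
    if crash_midway:
        disk.blocks[addr] = data[: len(data) // 2] + "??"  # torn write
        return
    disk.blocks[addr] = data

def update_cow(disk, data, crash_midway=False):
    """ZFS-style: write a new block, then atomically flip the root pointer."""
    new_addr = max(disk.blocks, default=0) + 1
    disk.blocks[new_addr] = data   # old block is never touched
    if crash_midway:
        return                     # crash before the commit: root still
                                   # points at the old, intact block
    disk.root = new_addr           # single atomic commit point

# In-place: a crash mid-write corrupts the only copy.
d1 = Disk()
d1.blocks[1], d1.root = "meta-v1", 1
update_in_place(d1, 1, "meta-v2", crash_midway=True)
print(d1.blocks[d1.root])          # torn: neither v1 nor v2

# COW: a crash before the commit leaves the previous version readable.
d2 = Disk()
d2.blocks[1], d2.root = "meta-v1", 1
update_cow(d2, "meta-v2", crash_midway=True)
print(d2.blocks[d2.root])          # still "meta-v1", consistent
```

The crash window in the COW case collapses to the single root-pointer
update, which is why the metadata is consistent as of *some* point in time
even after an unplanned shutdown.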

> There was no loss of data here, just an interruption in the connection
> to the target, like power loss or any other unplanned shutdown.
> Corruption in this scenario is a significant regression w.r.t. UFS:
>   

I see no evidence that the data is or is not correct.  What we know is that
ZFS is attempting to read something and the device driver is returning EIO.
Unfortunately, EIO is a catch-all error code, so more digging to find the
root cause is needed.
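The catch-all nature of EIO is easy to see from userland: completely
different underlying failures (bad sector, dropped SAS link, controller
reset) all surface as the same errno.  A small illustrative sketch (the
failure causes here are made up for the example):

```python
import errno
import os

def read_block(cause):
    # Whatever the underlying failure was, the driver reports the
    # same generic errno: EIO ("Input/output error").
    raise OSError(errno.EIO, os.strerror(errno.EIO))

# Three hypothetical root causes, one indistinguishable error code:
for cause in ("media defect", "SAS link reset", "controller timeout"):
    try:
        read_block(cause)
    except OSError as e:
        print(cause, "->", e.errno, os.strerror(e.errno))
```

Since the errno alone carries no root-cause information, you have to go
below it (driver messages, fault management logs) to find out what
actually happened on the bus.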

However, I will bet a steak dinner that if this device was mirrored to
another, the pool will import just fine, with the affected device in a
faulted or degraded state.
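The reason is simply that a mirror read only needs one side to answer; a
side that returns EIO gets marked faulted and the read is retried on the
other side.  A toy sketch of that retry logic (illustrative only, not ZFS
internals):

```python
# Toy mirror read path (illustrative, not ZFS code): the read succeeds
# as long as one side of the mirror can return the block; the failing
# side is merely marked FAULTED.

class EIOError(Exception):
    pass

class Device:
    def __init__(self, data, healthy=True):
        self.data = data
        self.healthy = healthy
        self.state = "ONLINE"

    def read(self):
        if not self.healthy:
            self.state = "FAULTED"   # record the failure
            raise EIOError("I/O error")
        return self.data

def mirror_read(sides):
    for dev in sides:
        try:
            return dev.read()
        except EIOError:
            continue                 # fall through to the other side
    raise EIOError("all sides of the mirror failed")

good = Device("block-contents")
bad = Device("block-contents", healthy=False)
print(mirror_read([bad, good]))      # read succeeds via the good side
print(bad.state)                     # the bad side ends up FAULTED
```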

>   http://mail.opensolaris.org/pipermail/zfs-discuss/2008-June/048375.html
>   

I have no idea what Eric is referring to, and it does not match my
experience.  Unfortunately, he didn't reference any CRs either :-(.
"Your baby is ugly" posts aren't very useful.

That said, we are constantly improving the resiliency of ZFS (more good
stuff coming in b96), so it might be worth trying to recover with a later
version.  For example, boot SXCE b94 and try to import the pool.

> How about the scenario where you lose power suddenly, but only half of
> a mirrored VDEV is available when power is restored?  Is ZFS
> vulnerable to this type of unfixable corruption in that scenario,
> too?
>   

No, this works just fine as long as one side works.  But that is a very
different case.
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss