I am running ZFS version 3 on SunOS zen 5.10 Generic_118855-33 i86pc i386 i86pc.

What is baffling is that the disk did come back online and appear healthy,
but zpool still showed the filesystem inconsistency. As Miles said, the
resilver did not resume after the disk came back.
The only additions I have to the sequence shown are:
1) I am absolutely sure there were no disk writes in the interim, since the
non-global zones which use these filesystems were halted during the
operation.
2) The first time I unplugged the disk, I was upgrading to a larger disk,
so I still have that original disk intact.
3) I was afraid that ZFS might resilver backwards, i.e. from the 22% image
back to the original copy, so I pulled the new disk out again.

Current status:
# zpool status
  pool: external
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver completed with 0 errors on Sat Jun 21 07:42:03 2008
config:

        NAME           STATE     READ WRITE CKSUM
        external       ONLINE   26.57   114     0
          c12t0d0p0    ONLINE       4   114     0
          mirror       ONLINE   26.57     0     0
            c13t0d0p0  ONLINE   55.25 4.48K     0
            c16t0d0p0  ONLINE       0     0 53.14

Can I be sure that the unrecoverable error found is on the failed mirror?
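
One check I could try, I think: the verbose status listing should name the
datasets or files carrying the permanent errors, e.g.:

# zpool status -v external

If everything it lists lives on the mirror vdev, that would answer the
question.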

I was thinking of the following ways forward. Any comments most welcome:
1) Run a scrub. I am thinking that kicking this off might actually corrupt
data in the second vdev, so maybe starting with 2) might be a better
idea... (the exact commands I have in mind are sketched below)
2) Physically replace disk1 with the ORIGINAL disk2 and attempt a scrub.
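
For the record, the commands I have in mind for 1) are roughly as follows
(just a sketch; the pool name is the one from the status output above):

# zpool clear external    # reset the accumulated error counters
# zpool scrub external    # walk every block, repairing from good copies
# zpool status external   # watch scrub progress and look for fresh errors

If the scrub does start doing damage, 'zpool scrub -s external' should stop
it mid-flight.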

justin


-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Miles Nordin
Sent: 21 June 2008 02:46
To: zfs-discuss@opensolaris.org
Subject: Re: [zfs-discuss] zfs mirror broken?

>>>>> "jb" == Jeff Bonwick <[EMAIL PROTECTED]> writes:

    jb> If you say 'zpool online <pool> <disk>' that should tell ZFS
    jb> that the disk is healthy again and automatically kick off a
    jb> resilver.

    jb> Of course, that should have happened automatically.

With b71 I find that it does sometimes happen automatically, but the
resilver isn't enough to avoid checksum errors later.  Only a
manually-requested scrub will stop further checksum errors from
accumulating.
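
Concretely, the sequence that has worked for me looks like this (a sketch
only; the pool and device names are placeholders for whatever component
flapped):

# zpool online mypool c1t0d0   # mark the flapped device healthy again;
                               # an auto-resilver may or may not kick off
# zpool scrub mypool           # the manual scrub is what actually stops
                               # further checksum errors accumulating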

Also, if I reboot before one of these auto-resilvers finishes, or plug the
component that flapped back in while the machine is powered down, the
auto-resilver never resumes.

    >> While one vdev was resilvering at 22% (HD replacement), the
    >> original disk went away 

So if I understand you, it happened like this:

    #1                #2

  online             online
t online             UNPLUG
i online             UNPLUG        <-- filesystem writes
m online             UNPLUG        <-- filesystem writes
e online             online
| online resilver -> online
v UNPLUG    xxx      online        --> fs reads allowed?  how?
  online             online        why no resilvering?

It seems to me like the right thing to do after #1 is unplugged is to take
the whole pool UNAVAIL until the original disk #1 comes back.  When the
original disk #1 drops off, the only available component left is the #2
component that flapped earlier and is still being resilvered, so #2 is
out-of-date and should be ignored.  But I'm pretty sure ZFS doesn't work
that way, right?

What does it do?  Will it serve incorrect, old data?  Will it somehow return
I/O errors for data that has changed on #1 and not been resilvered onto #2
yet?
