DISCLAIMERS:

ZFS bits on this server are old:
        # pkginfo -l SUNWzfsr |grep -i version
        VERSION:  11.11,REV=2006.01.03.01.17
OS is an old build of Nevada:
        SunOS 5.11 snv_31

Experts,

I have what is hopefully a simple question. We have a ZFS pool (dilbert) consisting of six 2-way mirrors. Each mirror consists of 2 drives on separate controllers on a D1000. Recently one of the drives croaked. I attempted to offline it, which should have worked since the data on it was mirrored, but the command failed:

# zpool status -v
  pool: dilbert
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool online' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: none requested
config:

        NAME         STATE     READ WRITE CKSUM
        dilbert      ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t0d0   ONLINE       0     0     0
            c2t8d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t1d0   ONLINE       0     0     0
            c2t9d0   ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t2d0   ONLINE       0     0     0
            c2t10d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t3d0   ONLINE       0     0     0
            c2t11d0  ONLINE      55 130.2     0
          mirror     ONLINE       0     0     0
            c1t4d0   ONLINE       0     0     0
            c2t12d0  ONLINE       0     0     0
          mirror     ONLINE       0     0     0
            c1t5d0   ONLINE       0     0     0
            c2t13d0  ONLINE       0     0     0
# zpool offline dilbert c2t11d0
cannot offline /dev/dsk/c2t11d0: no valid replicas
#

We replaced the failed drive with a good one, which started the resilvering:

<walk to server room, replace drive>
# devfsadm
# zpool replace dilbert c2t11d0
# zpool status -v
  pool: dilbert
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress, 0.35% done, 0h28m to go
config:

        NAME               STATE     READ WRITE CKSUM
        dilbert            DEGRADED     0     0     0
          mirror           ONLINE       0     0     0
            c1t0d0         ONLINE       0     0     0
            c2t8d0         ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            c1t1d0         ONLINE       0     0     0
            c2t9d0         ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            c1t2d0         ONLINE       0     0     0
            c2t10d0        ONLINE       0     0     0
          mirror           DEGRADED     0     0     0
            c1t3d0         ONLINE       0     0     0
            replacing      DEGRADED     0     0     0
              c2t11d0s0/o  FAULTED     55 152.6     0  cannot open
              c2t11d0      ONLINE       0     0     0  4.54M resilvered
          mirror           ONLINE       0     0     0
            c1t4d0         ONLINE       0     0     0
            c2t12d0        ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            c1t5d0         ONLINE       0     0     0
            c2t13d0        ONLINE       0     0     0
#

This appears to have worked fine, but now (3 days later) the pool is still in a degraded state, although the resilvering appears to have completed.

# zpool status
  pool: dilbert
 state: DEGRADED
 scrub: resilver completed with 0 errors on Fri Jun  1 10:31:39 2007
config:

        NAME               STATE     READ WRITE CKSUM
        dilbert            DEGRADED     0     0     0
          mirror           ONLINE       0     0     0
            c1t0d0         ONLINE       0     0     0
            c2t8d0         ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            c1t1d0         ONLINE       0     0     0
            c2t9d0         ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            c1t2d0         ONLINE       0     0     0
            c2t10d0        ONLINE       0     0     0
          mirror           DEGRADED     0     0     0
            c1t3d0         ONLINE       0     0     0
            replacing      DEGRADED     0     0     0
              c2t11d0s0/o  FAULTED     55 152.6     0  cannot open
              c2t11d0      ONLINE       0     0     0  3.16G resilvered
          mirror           ONLINE       0     0     0
            c1t4d0         ONLINE       0     0     0
            c2t12d0        ONLINE       0     0     0
          mirror           ONLINE       0     0     0
            c1t5d0         ONLINE       0     0     0
            c2t13d0        ONLINE       0     0     0
#
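
For anyone who wants to detect this condition from a script, here is a rough sketch in Python. It only parses the text of `zpool status` as shown above and reports any "replacing" vdev that is still listed after the scrub line says the resilver completed. The function name `find_stuck_replacements`, the `SAMPLE` excerpt, and the indentation-based parsing are my own assumptions for illustration; they are not part of ZFS, and the output layout may differ on other builds.

```python
# Excerpt of the 'zpool status' output from this post (assumed layout).
SAMPLE = """
  pool: dilbert
 state: DEGRADED
 scrub: resilver completed with 0 errors on Fri Jun  1 10:31:39 2007
config:

        NAME               STATE     READ WRITE CKSUM
        dilbert            DEGRADED     0     0     0
          mirror           DEGRADED     0     0     0
            c1t3d0         ONLINE       0     0     0
            replacing      DEGRADED     0     0     0
              c2t11d0s0/o  FAULTED     55 152.6     0  cannot open
              c2t11d0      ONLINE       0     0     0  3.16G resilvered
          mirror           ONLINE       0     0     0
            c1t4d0         ONLINE       0     0     0
            c2t12d0        ONLINE       0     0     0
"""


def find_stuck_replacements(status_text):
    """Return (old_device, new_device) pairs for 'replacing' vdevs that are
    still listed even though the resilver is reported as completed."""
    if "resilver completed" not in status_text:
        return []  # resilver still running (or never started): nothing to flag

    pairs = []
    lines = status_text.splitlines()
    for i, line in enumerate(lines):
        fields = line.split()
        if not fields or fields[0] != "replacing":
            continue
        indent = len(line) - len(line.lstrip())
        children = []
        # Children of the 'replacing' vdev are the following lines that are
        # indented more deeply than the 'replacing' line itself.
        for child in lines[i + 1:]:
            child_indent = len(child) - len(child.lstrip())
            if not child.strip() or child_indent <= indent:
                break
            children.append(child.split()[0])
        if len(children) == 2:
            pairs.append((children[0], children[1]))
    return pairs


if __name__ == "__main__":
    for old, new in find_stuck_replacements(SAMPLE):
        print("replacement still pending: %s -> %s" % (old, new))
```

Run against the excerpt above, this prints `replacement still pending: c2t11d0s0/o -> c2t11d0`, which matches what we see by eye in the last status output.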

Is there some additional step we need to take in order to complete the process of replacing a failed disk?

Thanks.

--Steve


*************************************

Steve Person, SCSA 8/9, RHCT
Systems/Network Administrator
Hillsboro Operations Engineering
Sun Microsystems, Inc.
Office: 503-342-3264
Fax:    503-342-3264

*************************************