From pages 29, 83, 86, 90, and 284 of the 10/09 Solaris ZFS Administration Guide, it sounds like a disk designated as a hot spare will:

1. Automatically take the place of a failed drive when needed.
2. Automatically be detached back to the spare pool when a new device is inserted and brought up to replace the original, compromised one.
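(For reference, the guide also describes a pool-level "autoreplace" property, which is off by default. I am not certain whether it matters for the spare behavior I describe below, but it can be checked and enabled per pool, e.g.:

# zpool get autoreplace rpool store1
# zpool set autoreplace=on rpool
# zpool set autoreplace=on store1
)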
Should this work the same way for slices?

I have four active disks in a RAID 10 configuration for a storage pool, and the same disks are used for mirrored root configurations, but only one of the possible mirrored root slice pairs is currently active. I wanted to designate slices on a 5th disk as hot spares for the two existing pools, so after partitioning the 5th disk (#4) identically to the four existing disks, I ran:

# zpool add rpool spare c0t4d0s0
# zpool add store1 spare c0t4d0s7
# zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0
        spares
          c0t4d0s0    AVAIL

errors: No known data errors

  pool: store1
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        store1        ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s7  ONLINE       0     0     0
            c0t1d0s7  ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t2d0s7  ONLINE       0     0     0
            c0t3d0s7  ONLINE       0     0     0
        spares
          c0t4d0s7    AVAIL

errors: No known data errors

--

So it looked like everything was set up the way I was hoping, until I emulated a disk failure by pulling one of the online disks. The root pool responded as I expected, but the storage pool, on slice 7, did not appear to perform the autoreplace. Not too long after pulling one of the online disks:

--------------------
# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver in progress for 0h0m, 10.02% done, 0h5m to go
config:

        NAME            STATE     READ WRITE CKSUM
        rpool           DEGRADED     0     0     0
          mirror        DEGRADED     0     0     0
            c0t0d0s0    ONLINE       0     0     0
            spare       DEGRADED    84     0     0
              c0t1d0s0  REMOVED      0     0     0
              c0t4d0s0  ONLINE       0     0    84  329M resilvered
        spares
          c0t4d0s0      INUSE     currently in use

errors: No known data errors

  pool: store1
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        store1        ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t0d0s7  ONLINE       0     0     0
            c0t1d0s7  ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t2d0s7  ONLINE       0     0     0
            c0t3d0s7  ONLINE       0     0     0
        spares
          c0t4d0s7    AVAIL

errors: No known data errors
--------------------

I was able to get store1 into the DEGRADED state by writing to a file in that storage pool, but it always listed the spare as AVAIL, even while showing c0t1d0s7 as REMOVED in the same pool.

Based on the manual, I expected the system to bring a reinserted disk back online automatically, but zpool status still showed it as "REMOVED". To get it back online:

# zpool detach rpool c0t4d0s0
# zpool clear rpool
# zpool clear store1

Then status showed *both* pools resilvering.

So the questions are:

1. Does autoreplace work on slices, or just complete disks?
2. Is there a problem replacing a "bad" disk with the same disk to get the autoreplace function to work?
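In case it is useful to anyone answering: my understanding from the guide is that a spare can also be pulled in by hand with 'zpool replace' and then detached once the original device is healthy again, something along the lines of:

# zpool replace store1 c0t1d0s7 c0t4d0s7
        (activate the s7 spare in place of the removed slice)
# zpool detach store1 c0t4d0s7
        (return the spare to the spares list once c0t1d0s7 is back and resilvered)

I have not verified that this is the intended procedure when the pool is built on slices rather than whole disks.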