If he adds the spare and then manually forces a replace, it will take
no more time than any other way. I do this quite frequently, and without
needing the scrub, which does take quite a lot of time.
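Roughly, the sequence would be something like this (device names taken
from the pool layout further down, purely as an example):
# zpool add tank spare c1t15d0
# zpool replace tank c1t6d0 c1t15d0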
cindy.swearin...@sun.com wrote:
Hi Andreas,
Good job for using a mirrored configuration. :-)
Your various approaches would work.
My only comment about #2 is that it might take some time for the spare
to kick in for the faulted disk.
Both 1 and 2 would take a bit more time than just replacing the faulted
disk with a spare disk, like this:
# zpool replace tank c1t6d0 c1t15d0
Then you could physically replace c1t6d0 and add it back to the pool as
a spare, like this:
# zpool add tank spare c1t6d0
For a production system, the steps above might be the most efficient.
Get the faulted disk replaced with a known good disk so the pool is
no longer degraded, then physically replace the bad disk when you have
the time and add it back to the pool as a spare.
It is also good practice to run a zpool scrub to ensure the
replacement is operational and use zpool clear to clear the previous
errors on the pool. If the system is used heavily, then you might want
to run the zpool scrub when system use is reduced.
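For example, something along these lines:
# zpool scrub tank
# zpool status tank     (confirm the scrub finished without new errors)
# zpool clear tank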
If you were going to physically replace c1t6d0 while it was still
attached to the pool, then you might offline it first.
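For example:
# zpool offline tank c1t6d0
and, if you put the new disk into the same slot, something like:
# zpool replace tank c1t6d0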
Cindy
On 08/06/09 13:17, Andreas Höschler wrote:
Dear managers,
one of our servers (X4240) shows a faulty disk:
------------------------------------------------------------------------
-bash-3.00# zpool status
pool: rpool
state: ONLINE
scrub: none requested
config:
NAME          STATE     READ WRITE CKSUM
rpool         ONLINE       0     0     0
  mirror      ONLINE       0     0     0
    c1t0d0s0  ONLINE       0     0     0
    c1t1d0s0  ONLINE       0     0     0
errors: No known data errors
pool: tank
state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
scrub: none requested
config:
NAME        STATE     READ WRITE CKSUM
tank        DEGRADED     0     0     0
  mirror    ONLINE       0     0     0
    c1t2d0  ONLINE       0     0     0
    c1t3d0  ONLINE       0     0     0
  mirror    ONLINE       0     0     0
    c1t5d0  ONLINE       0     0     0
    c1t4d0  ONLINE       0     0     0
  mirror    DEGRADED     0     0     0
    c1t6d0  FAULTED      0    19     0  too many errors
    c1t7d0  ONLINE       0     0     0
errors: No known data errors
------------------------------------------------------------------------
I derived the following possible approaches to solve the problem:
1) A way to reestablish redundancy would be to use the command
zpool attach tank c1t7d0 c1t15d0
to add c1t15d0 to the virtual device "c1t6d0 + c1t7d0". We would still
have the faulty disk in the virtual device.
We could then detach the faulty disk with the command
zpool detach tank c1t6d0
2) Another approach would be to add a spare disk to tank
zpool add tank spare c1t15d0
and then replace the faulty disk:
zpool replace tank c1t6d0 c1t15d0
In theory that is easy, but since I have never done this and since this
is a production server, I would appreciate it if someone with more
experience looked over my plan before I issue these commands.
What is the difference between the two approaches? Which one do you
recommend? And is that really all that has to be done, or am I missing
something? I mean, can c1t6d0 be physically replaced after issuing "zpool
detach tank c1t6d0" or "zpool replace tank c1t6d0 c1t15d0"? I also
found the command
zpool offline tank ...
but am not sure whether this should be used in my case. Hints are
greatly appreciated!
Thanks a lot,
Andreas
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss