On May 27, 2012, at 12:52 PM, Stephan Budach wrote:

> Hi, 
> 
> today I issued a scrub on one of my zpools, and after some time I noticed that 
> one of the vdevs became degraded due to a drive having cksum errors. The 
> spare kicked in and the drive got resilvered, but why does the spare drive 
> now also show almost the same number of cksum errors as the degraded drive?

The answer is not available via zpool status. You will need to look at the FMA 
diagnosis:
        fmadm faulty

More clues can be found in the FMA error reports:
        fmdump -eV
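
If you just want to eyeball which rows of a saved status listing carry
non-zero CKSUM counts, a throwaway awk filter does it. This is a sketch,
not part of any ZFS tooling; the sample input is trimmed from the listing
quoted below, and the /tmp path is arbitrary:

```shell
# Illustration only: filter a saved `zpool status` listing for devices
# with a non-zero CKSUM count. Sample input trimmed from the listing
# quoted in this thread; the /tmp path is arbitrary.
cat > /tmp/zpool_status_sample.txt <<'EOF'
            c9t2100001378AC02DDd9    ONLINE       0     0     0
            spare-1                  DEGRADED     0     0    10
              c9t2100001378AC02F4d9  DEGRADED     0     0    22
              c9t2100001378AC02BFd1  ONLINE       0     0    23
EOF
# Column 5 is CKSUM in the device rows; print any row where it is > 0.
awk '$5 ~ /^[0-9]+$/ && $5 > 0 { print $1, "cksum=" $5 }' /tmp/zpool_status_sample.txt
```

That narrows things down, but the FMA reports above are still where the
actual diagnosis lives.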

 -- richard


> 
> root@solaris11c:~# zpool status obelixData
>   pool: obelixData
>  state: DEGRADED
> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>   scan: resilvered 1,12T in 10h50m with 0 errors on Sun May 27 21:15:32 2012
> config:
> 
>         NAME                         STATE     READ WRITE CKSUM
>         obelixData                   DEGRADED     0     0     0
>           mirror-0                   ONLINE       0     0     0
>             c9t2100001378AC02DDd1    ONLINE       0     0     0
>             c9t2100001378AC02F4d1    ONLINE       0     0     0
>           mirror-1                   ONLINE       0     0     0
>             c9t2100001378AC02F4d0    ONLINE       0     0     0
>             c9t2100001378AC02DDd0    ONLINE       0     0     0
>           mirror-2                   ONLINE       0     0     0
>             c9t2100001378AC02DDd2    ONLINE       0     0     0
>             c9t2100001378AC02F4d2    ONLINE       0     0     0
>           mirror-3                   ONLINE       0     0     0
>             c9t2100001378AC02DDd3    ONLINE       0     0     0
>             c9t2100001378AC02F4d3    ONLINE       0     0     0
>           mirror-4                   ONLINE       0     0     0
>             c9t2100001378AC02DDd5    ONLINE       0     0     0
>             c9t2100001378AC02F4d5    ONLINE       0     0     0
>           mirror-5                   ONLINE       0     0     0
>             c9t2100001378AC02DDd4    ONLINE       0     0     0
>             c9t2100001378AC02F4d4    ONLINE       0     0     0
>           mirror-6                   ONLINE       0     0     0
>             c9t2100001378AC02DDd6    ONLINE       0     0     0
>             c9t2100001378AC02F4d6    ONLINE       0     0     0
>           mirror-7                   ONLINE       0     0     0
>             c9t2100001378AC02DDd7    ONLINE       0     0     0
>             c9t2100001378AC02F4d7    ONLINE       0     0     0
>           mirror-8                   ONLINE       0     0     0
>             c9t2100001378AC02DDd8    ONLINE       0     0     0
>             c9t2100001378AC02F4d8    ONLINE       0     0     0
>           mirror-9                   DEGRADED     0     0     0
>             c9t2100001378AC02DDd9    ONLINE       0     0     0
>             spare-1                  DEGRADED     0     0    10
>               c9t2100001378AC02F4d9  DEGRADED     0     0    22  too many errors
>               c9t2100001378AC02BFd1  ONLINE       0     0    23
>           mirror-10                  ONLINE       0     0     0
>             c9t2100001378AC02DDd10   ONLINE       0     0     0
>             c9t2100001378AC02F4d10   ONLINE       0     0     0
>           mirror-11                  ONLINE       0     0     0
>             c9t2100001378AC02DDd11   ONLINE       0     0     0
>             c9t2100001378AC02F4d11   ONLINE       0     0     0
>           mirror-12                  ONLINE       0     0     0
>             c9t2100001378AC02DDd12   ONLINE       0     0     0
>             c9t2100001378AC02F4d12   ONLINE       0     0     0
>           mirror-13                  ONLINE       0     0     0
>             c9t2100001378AC02DDd13   ONLINE       0     0     0
>             c9t2100001378AC02F4d13   ONLINE       0     0     0
>           mirror-14                  ONLINE       0     0     0
>             c9t2100001378AC02DDd14   ONLINE       0     0     0
>             c9t2100001378AC02F4d14   ONLINE       0     0     0
>         logs
>           mirror-15                  ONLINE       0     0     0
>             c9t2100001378AC02D9d0    ONLINE       0     0     0
>             c9t2100001378AC02BFd0    ONLINE       0     0     0
>         spares
>           c9t2100001378AC02BFd1      INUSE     currently in use
> 
> 
> What would be the best way to proceed? The drive c9t2100001378AC02BFd1 is the 
> spare drive, which is tagged as ONLINE, but it shows 23 cksum errors, while 
> the drive that became degraded only shows 22 cksum errors.
> 
> Which would be the better procedure: run another scrub first and detach the 
> degraded drive afterwards, or detach the degraded drive immediately and run 
> a scrub afterwards?
> 
> Thanks,
> budy
> 
> 
> 
>  -- 
> Stephan Budach
> Jung von Matt/it-services GmbH
> Glashüttenstraße 79
> 20357 Hamburg
> 
> 
> Tel: +49 40-4321-1353
> Fax: +49 40-4321-1114
> E-Mail: stephan.bud...@jvm.de
> Internet: http://www.jvm.com
> 
> Geschäftsführer: Frank Wilhelm, Stephan Budach (stellv.)
> AG HH HRB 98380
> _______________________________________________
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

--
ZFS Performance and Training
richard.ell...@richardelling.com
+1-760-896-4422