Hello all,

I'm having a problem that I find difficult to diagnose.

I have an IBM x3550 M3 running Nexenta Core Platform 3.0.1 (134f) with a pool
of 7 x 6-disk RAIDZ2 vdevs (see listing at bottom).
Every day a disk fails with "Too many checksum errors", is marked as
degraded and is rebuilt onto a hot spare. I've been doing 'zpool detach
zpool002 <degraded disk>' to remove it from the zpool and return the pool's
status to 'ONLINE'. Later that day (or sometimes the next day), another disk
is marked as degraded due to checksum errors and is rebuilt onto a hot spare
again. Rinse, repeat.

We've been logging this for the past few days, and two things stand out:
1. The disk that fails appears to be the hot spare that we rebuilt onto the
previous time.
2. If I don't detach the degraded disk, the newly rebuilt hot spare does
not seem to fail.
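
For anyone who wants to log the same thing, here is a minimal sketch of how
the per-device CKSUM counts could be pulled out of 'zpool status' output so
they can be compared from day to day. It is not the logging we actually run;
the function name cksum_errors and the regular expression are illustrative
assumptions, and it assumes the counters are plain integers as in the listing
at the bottom (some zpool versions abbreviate large counts, which this would
miss).

```python
import re

# Matches a device row of `zpool status`: name, state, READ, WRITE, CKSUM.
# Assumption: counters are plain integers, as in the listing in this post.
ROW = re.compile(
    r"^\s+(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)"
)

def cksum_errors(status_text):
    """Return {device_name: cksum_count} for rows with a non-zero CKSUM column."""
    errors = {}
    for line in status_text.splitlines():
        m = ROW.match(line)
        if m and int(m.group(5)) > 0:
            errors[m.group(1)] = int(m.group(5))
    return errors

# A few rows copied from the listing at the bottom of this post.
sample = """\
          raidz2-1                   DEGRADED     0     0     0
            c8t5000C50020E3F337d0    ONLINE       0     0     0
            spare-5                  DEGRADED     0     0   202
              c8t5000C5001034370Bd0  DEGRADED     0     0    23  too many errors
              c8t5000C50020E3F617d0  ONLINE       0     0     0
"""

print(cksum_errors(sample))
```

Saving the output of 'zpool status zpool002' with a timestamp each day and
running it through something like this would make it easy to check whether
the disk that degrades is always the previously used hot spare.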

I'm running a scrub now to confirm there are no further checksum errors.
Then I will detach the 'degraded' drive from the pool and see whether the new
hot spare fails within the next 24 hours. Has anyone seen this before?

Thanks,
Ashley

pool: zpool002
state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
scan: scrub in progress since Mon Jan 24 17:17:39 2011
    25.3G scanned out of 3.91T at 25.9M/s, 43h38m to go
    0 repaired, 0.63% done
config:

        NAME                         STATE     READ WRITE CKSUM
        zpool002                     DEGRADED     0     0     0
          raidz2-0                   ONLINE       0     0     0
            c8t5000C50020C780C3d0    ONLINE       0     0     0
            c8t5000C50020C785FBd0    ONLINE       0     0     0
            c8t5000C50020C7610Bd0    ONLINE       0     0     0
            c8t5000C50020C77413d0    ONLINE       0     0     0
            c8t5000C50020C77437d0    ONLINE       0     0     0
            c8t5000C50020DC9AE7d0    ONLINE       0     0     0
          raidz2-1                   DEGRADED     0     0     0
            c8t5000C50020DCBDCFd0    ONLINE       0     0     0
            c8t5000C50020E3E85Fd0    ONLINE       0     0     0
            c8t5000C50020E3F5FBd0    ONLINE       0     0     0
            c8t5000C50020E3F37Bd0    ONLINE       0     0     0
            c8t5000C50020E3F337d0    ONLINE       0     0     0
            spare-5                  DEGRADED     0     0   202
              c8t5000C5001034370Bd0  DEGRADED     0     0    23  too many errors
              c8t5000C50020E3F617d0  ONLINE       0     0     0
          raidz2-2                   ONLINE       0     0     0
            c8t5000C50020E9E6FFd0    ONLINE       0     0     0
            c8t5000C50020E33C97d0    ONLINE       0     0     0
            c8t5000C50020E94A63d0    ONLINE       0     0     0
            c8t5000C50020E94E4Bd0    ONLINE       0     0     0
            c8t5000C50020E233CFd0    ONLINE       0     0     0
            c8t5000C50020E3447Fd0    ONLINE       0     0     0
          raidz2-3                   ONLINE       0     0     0
            c8t5000C50020E9549Bd0    ONLINE       0     0     0
            c8t5000C50020E20003d0    ONLINE       0     0     0
            c8t5000C50020E28723d0    ONLINE       0     0     0
            c8t5000C50020E32873d0    ONLINE       0     0     0
            c8t5000C50020E95887d0    ONLINE       0     0     0
            c8t5000C50020E96577d0    ONLINE       0     0     0
          raidz2-4                   ONLINE       0     0     0
            c8t5000C50010384D1Fd0    ONLINE       0     0     0
            c8t5000C50021176F43d0    ONLINE       0     0     0
            c8t5000C50021177B3Bd0    ONLINE       0     0     0
            c8t5000C500211785F3d0    ONLINE       0     0     0
            c8t5000C500211792AFd0    ONLINE       0     0     0
            c8t5000C500211795C3d0    ONLINE       0     0     0
          raidz2-5                   ONLINE       0     0     0
            c8t5000C50025CCFEEBd0    ONLINE       0     0     0
            c8t5000C500104D7BEFd0    ONLINE       0     0     0
            c8t5000C500104D7FE7d0    ONLINE       0     0     0
            c8t5000C500104DD5AFd0    ONLINE       0     0     0
            c8t5000C500104DD43Bd0    ONLINE       0     0     0
            c8t5000C500104DD78Bd0    ONLINE       0     0     0
          raidz2-6                   ONLINE       0     0     0
            c8t5000C500104DDF17d0    ONLINE       0     0     0
            c8t5000C500104DE287d0    ONLINE       0     0     0
            c8t5000C500104E3BE7d0    ONLINE       0     0     0
            c8t5000C500104E3D83d0    ONLINE       0     0     0
            c8t5000C500104E3F9Fd0    ONLINE       0     0     0
            c8t5000C50010353C77d0    ONLINE       0     0     0
        logs
          c3d0                       ONLINE       0     0     0
          c4d0                       ONLINE       0     0     0
        cache
          c2d1                       ONLINE       0     0     0
        spares
          c8t5000C50020E3F617d0      INUSE     currently in use
          c8t5000C50021177453d0      AVAIL
          c8t5000C5002117792Fd0      AVAIL
          c8t5000C50021177297d0      AVAIL
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
