Hello all, I'm having a problem that I find difficult to diagnose.
I have an IBM x3550 M3 running Nexenta Core Platform 3.0.1 (134f) with seven 6-disk RAIDZ2 vdevs (see listing at bottom). Every day a disk fails with "Too many checksum errors", is marked as degraded, and is rebuilt onto a hot spare. I've been running 'zpool detach zpool002 <degraded disk>' to remove it from the zpool and return the pool's status to 'ONLINE'. Later that day (or sometimes the next day), a disk is marked as degraded due to checksum errors and is rebuilt onto a hot spare again. Rinse, repeat.

We've been logging this for the past few days, and there are a couple of things to notice:

1. The disk that fails appears to be the hot spare that we rebuilt onto the previous time.
2. If I don't detach the degraded disk, then the newly rebuilt hot spare does not seem to fail.

I'm running a scrub now to confirm there are no further checksum errors, and then I will detach the 'degraded' drive from the pool and see whether the new hot spare fails in the next 24 hours.

Has anyone seen this before?

Thanks,
Ashley

  pool: zpool002
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
  scan: scrub in progress since Mon Jan 24 17:17:39 2011
        25.3G scanned out of 3.91T at 25.9M/s, 43h38m to go
        0 repaired, 0.63% done
config:

        NAME                         STATE     READ WRITE CKSUM
        zpool002                     DEGRADED     0     0     0
          raidz2-0                   ONLINE       0     0     0
            c8t5000C50020C780C3d0    ONLINE       0     0     0
            c8t5000C50020C785FBd0    ONLINE       0     0     0
            c8t5000C50020C7610Bd0    ONLINE       0     0     0
            c8t5000C50020C77413d0    ONLINE       0     0     0
            c8t5000C50020C77437d0    ONLINE       0     0     0
            c8t5000C50020DC9AE7d0    ONLINE       0     0     0
          raidz2-1                   DEGRADED     0     0     0
            c8t5000C50020DCBDCFd0    ONLINE       0     0     0
            c8t5000C50020E3E85Fd0    ONLINE       0     0     0
            c8t5000C50020E3F5FBd0    ONLINE       0     0     0
            c8t5000C50020E3F37Bd0    ONLINE       0     0     0
            c8t5000C50020E3F337d0    ONLINE       0     0     0
            spare-5                  DEGRADED     0     0   202
              c8t5000C5001034370Bd0  DEGRADED     0     0    23  too many errors
              c8t5000C50020E3F617d0  ONLINE       0     0     0
          raidz2-2                   ONLINE       0     0     0
            c8t5000C50020E9E6FFd0    ONLINE       0     0     0
            c8t5000C50020E33C97d0    ONLINE       0     0     0
            c8t5000C50020E94A63d0    ONLINE       0     0     0
            c8t5000C50020E94E4Bd0    ONLINE       0     0     0
            c8t5000C50020E233CFd0    ONLINE       0     0     0
            c8t5000C50020E3447Fd0    ONLINE       0     0     0
          raidz2-3                   ONLINE       0     0     0
            c8t5000C50020E9549Bd0    ONLINE       0     0     0
            c8t5000C50020E20003d0    ONLINE       0     0     0
            c8t5000C50020E28723d0    ONLINE       0     0     0
            c8t5000C50020E32873d0    ONLINE       0     0     0
            c8t5000C50020E95887d0    ONLINE       0     0     0
            c8t5000C50020E96577d0    ONLINE       0     0     0
          raidz2-4                   ONLINE       0     0     0
            c8t5000C50010384D1Fd0    ONLINE       0     0     0
            c8t5000C50021176F43d0    ONLINE       0     0     0
            c8t5000C50021177B3Bd0    ONLINE       0     0     0
            c8t5000C500211785F3d0    ONLINE       0     0     0
            c8t5000C500211792AFd0    ONLINE       0     0     0
            c8t5000C500211795C3d0    ONLINE       0     0     0
          raidz2-5                   ONLINE       0     0     0
            c8t5000C50025CCFEEBd0    ONLINE       0     0     0
            c8t5000C500104D7BEFd0    ONLINE       0     0     0
            c8t5000C500104D7FE7d0    ONLINE       0     0     0
            c8t5000C500104DD5AFd0    ONLINE       0     0     0
            c8t5000C500104DD43Bd0    ONLINE       0     0     0
            c8t5000C500104DD78Bd0    ONLINE       0     0     0
          raidz2-6                   ONLINE       0     0     0
            c8t5000C500104DDF17d0    ONLINE       0     0     0
            c8t5000C500104DE287d0    ONLINE       0     0     0
            c8t5000C500104E3BE7d0    ONLINE       0     0     0
            c8t5000C500104E3D83d0    ONLINE       0     0     0
            c8t5000C500104E3F9Fd0    ONLINE       0     0     0
            c8t5000C50010353C77d0    ONLINE       0     0     0
        logs
          c3d0                       ONLINE       0     0     0
          c4d0                       ONLINE       0     0     0
        cache
          c2d1                       ONLINE       0     0     0
        spares
          c8t5000C50020E3F617d0      INUSE     currently in use
          c8t5000C50021177453d0      AVAIL
          c8t5000C5002117792Fd0      AVAIL
          c8t5000C50021177297d0      AVAIL
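
For anyone who wants to log which devices are accumulating checksum errors between runs, here is a minimal sketch (not something from my actual setup) that pulls the rows with a non-zero CKSUM count out of saved 'zpool status' text. The function name `devices_with_cksum_errors`, the `ROW` pattern, and the `SAMPLE` text are made up for illustration. It assumes the usual multi-line layout with NAME, STATE, READ, WRITE, CKSUM columns and plain integer counters; rows whose counters are abbreviated (if the tool prints them that way) would be skipped.

```python
import re

# One device row of the "config:" table: NAME STATE READ WRITE CKSUM [note].
# Assumption: counters are plain integers; spare rows (INUSE/AVAIL) have no
# counters and are deliberately not matched.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s*(.*)$"
)


def devices_with_cksum_errors(status_text):
    """Return (name, state, cksum, note) for every row with CKSUM > 0."""
    hits = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header, section label, or spare row
        name, state, _read, _write, cksum, note = match.groups()
        if int(cksum) > 0:
            hits.append((name, state, int(cksum), note.strip()))
    return hits


# Excerpt of the listing above, used only as sample input.
SAMPLE = """\
          raidz2-1                   DEGRADED     0     0     0
            c8t5000C50020E3F337d0    ONLINE       0     0     0
            spare-5                  DEGRADED     0     0   202
              c8t5000C5001034370Bd0  DEGRADED     0     0    23  too many errors
              c8t5000C50020E3F617d0  ONLINE       0     0     0
        spares
          c8t5000C50020E3F617d0      INUSE     currently in use
"""

for name, state, cksum, note in devices_with_cksum_errors(SAMPLE):
    print(name, state, cksum, note)
```

Feeding it the output of 'zpool status zpool002' captured at intervals (for example from cron) would give a per-device history, which might help confirm whether it really is always the previously used spare that degrades.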
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss