On Jan 24, 2011, at 9:52 AM, Ashley Nicholls wrote:

> Hello all,
>
> I'm having a problem that I find difficult to diagnose.
>
> I have an IBM x3550 M3 running Nexenta Core Platform 3.0.1 (134f) with
> seven 6-disk RAIDZ2 vdevs (see listing at bottom).
> Every day a disk fails with "Too many checksum errors", is marked as
> degraded and rebuilt onto a hot spare. I've been doing
> 'zpool detach zpool002 <degraded disk>' to remove the degraded disk from
> the zpool and return the pool's status to 'ONLINE'. Later that day (or
> sometimes the next day), another disk is marked as degraded due to
> checksum errors and is rebuilt onto a hot spare again; rinse, repeat.
>
> We've been logging this for the past few days, and there are a few
> things to note, however:
> 1. The disk that fails appears to be the hot spare that we rebuilt onto
>    the previous time.
> 2. If I don't detach the degraded disk, the newly rebuilt hot spare does
>    not seem to fail.
>
> I'm just doing a scrub now to confirm there are no further checksum
> errors, and then I will detach the 'degraded' drive from the pool and
> see whether the new hot spare fails in the next 24 hours. Just wondering
> if anyone has seen this before?
I've seen this with SATA disks. Check the output of "fmdump -eV" and look at
the error reports for the ZFS checksum errors. They should show the type of
corruption detected, and the type of corruption suggests further avenues of
analysis (a sketch of one such query follows the pool listing below).
 -- richard

> Thanks,
> Ashley
>
>   pool: zpool002
>  state: DEGRADED
> status: One or more devices has experienced an unrecoverable error. An
>         attempt was made to correct the error. Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>   scan: scrub in progress since Mon Jan 24 17:17:39 2011
>     25.3G scanned out of 3.91T at 25.9M/s, 43h38m to go
>     0 repaired, 0.63% done
> config:
>
>         NAME                         STATE     READ WRITE CKSUM
>         zpool002                     DEGRADED     0     0     0
>           raidz2-0                   ONLINE       0     0     0
>             c8t5000C50020C780C3d0    ONLINE       0     0     0
>             c8t5000C50020C785FBd0    ONLINE       0     0     0
>             c8t5000C50020C7610Bd0    ONLINE       0     0     0
>             c8t5000C50020C77413d0    ONLINE       0     0     0
>             c8t5000C50020C77437d0    ONLINE       0     0     0
>             c8t5000C50020DC9AE7d0    ONLINE       0     0     0
>           raidz2-1                   DEGRADED     0     0     0
>             c8t5000C50020DCBDCFd0    ONLINE       0     0     0
>             c8t5000C50020E3E85Fd0    ONLINE       0     0     0
>             c8t5000C50020E3F5FBd0    ONLINE       0     0     0
>             c8t5000C50020E3F37Bd0    ONLINE       0     0     0
>             c8t5000C50020E3F337d0    ONLINE       0     0     0
>             spare-5                  DEGRADED     0     0   202
>               c8t5000C5001034370Bd0  DEGRADED     0     0    23  too many errors
>               c8t5000C50020E3F617d0  ONLINE       0     0     0
>           raidz2-2                   ONLINE       0     0     0
>             c8t5000C50020E9E6FFd0    ONLINE       0     0     0
>             c8t5000C50020E33C97d0    ONLINE       0     0     0
>             c8t5000C50020E94A63d0    ONLINE       0     0     0
>             c8t5000C50020E94E4Bd0    ONLINE       0     0     0
>             c8t5000C50020E233CFd0    ONLINE       0     0     0
>             c8t5000C50020E3447Fd0    ONLINE       0     0     0
>           raidz2-3                   ONLINE       0     0     0
>             c8t5000C50020E9549Bd0    ONLINE       0     0     0
>             c8t5000C50020E20003d0    ONLINE       0     0     0
>             c8t5000C50020E28723d0    ONLINE       0     0     0
>             c8t5000C50020E32873d0    ONLINE       0     0     0
>             c8t5000C50020E95887d0    ONLINE       0     0     0
>             c8t5000C50020E96577d0    ONLINE       0     0     0
>           raidz2-4                   ONLINE       0     0     0
>             c8t5000C50010384D1Fd0    ONLINE       0     0     0
>             c8t5000C50021176F43d0    ONLINE       0     0     0
>             c8t5000C50021177B3Bd0    ONLINE       0     0     0
>             c8t5000C500211785F3d0    ONLINE       0     0     0
>             c8t5000C500211792AFd0    ONLINE       0     0     0
>             c8t5000C500211795C3d0    ONLINE       0     0     0
>           raidz2-5                   ONLINE       0     0     0
>             c8t5000C50025CCFEEBd0    ONLINE       0     0     0
>             c8t5000C500104D7BEFd0    ONLINE       0     0     0
>             c8t5000C500104D7FE7d0    ONLINE       0     0     0
>             c8t5000C500104DD5AFd0    ONLINE       0     0     0
>             c8t5000C500104DD43Bd0    ONLINE       0     0     0
>             c8t5000C500104DD78Bd0    ONLINE       0     0     0
>           raidz2-6                   ONLINE       0     0     0
>             c8t5000C500104DDF17d0    ONLINE       0     0     0
>             c8t5000C500104DE287d0    ONLINE       0     0     0
>             c8t5000C500104E3BE7d0    ONLINE       0     0     0
>             c8t5000C500104E3D83d0    ONLINE       0     0     0
>             c8t5000C500104E3F9Fd0    ONLINE       0     0     0
>             c8t5000C50010353C77d0    ONLINE       0     0     0
>         logs
>           c3d0                       ONLINE       0     0     0
>           c4d0                       ONLINE       0     0     0
>         cache
>           c2d1                       ONLINE       0     0     0
>         spares
>           c8t5000C50020E3F617d0      INUSE     currently in use
>           c8t5000C50021177453d0      AVAIL
>           c8t5000C5002117792Fd0      AVAIL
>           c8t5000C50021177297d0      AVAIL
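For anyone who wants a starting point, below is a minimal sketch of the kind
of query Richard is suggesting. It assumes a Solaris/illumos fmdump(1M) with
the usual '-e' (error log) and '-c' (class filter) options, and the
'ereport.fs.zfs.checksum' class name is likewise an assumption based on
common usage; plain "fmdump -eV" as suggested above shows the same records,
just unfiltered.

    # One-line summary of the FMA error log; checksum problems should show
    # up as ereport.fs.zfs.checksum events.
    fmdump -e

    # Verbose detail for just the checksum ereports. The payload typically
    # names the pool and vdev the report was raised against and describes
    # the corruption that was detected, which is what the follow-up
    # analysis is based on.
    fmdump -eV -c ereport.fs.zfs.checksum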
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss