The endless resilver problem still persists on OI b147. It restarts whenever it should complete.
I see no other solution than to copy the data to safety and recreate the array. Any hints would be appreciated, as that takes days unless I can stop or pause the resilvering.

On Mon, Sep 27, 2010 at 1:13 PM, Tuomas Leikola <tuomas.leik...@gmail.com> wrote:

> Hi!
>
> My home server had some disk outages due to flaky cabling and whatnot, and
> started resilvering to a spare disk. During this another disk or two
> dropped, and were reinserted into the array. So no devices were actually
> lost, they just were intermittently away for a while each.
>
> The situation is currently as follows:
>   pool: tank
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error. An
>         attempt was made to correct the error. Applications are
>         unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: resilver in progress for 5h33m, 22.47% done, 19h10m to go
> config:
>
>         NAME                       STATE     READ WRITE CKSUM
>         tank                       ONLINE       0     0     0
>           raidz1-0                 ONLINE       0     0     0
>             c11t1d0p0              ONLINE       0     0     0
>             c11t2d0                ONLINE       0     0     5
>             c11t6d0p0              ONLINE       0     0     0
>             spare-3                ONLINE       0     0     0
>               c11t3d0p0            ONLINE       0     0     0  106M resilvered
>               c9d1                 ONLINE       0     0     0  104G resilvered
>             c11t4d0p0              ONLINE       0     0     0
>             c11t0d0p0              ONLINE       0     0     0
>             c11t5d0p0              ONLINE       0     0     0
>             c11t7d0p0              ONLINE       0     0     0  93.6G resilvered
>           raidz1-2                 ONLINE       0     0     0
>             c6t2d0                 ONLINE       0     0     0
>             c6t3d0                 ONLINE       0     0     0
>             c6t4d0                 ONLINE       0     0     0  2.50K resilvered
>             c6t5d0                 ONLINE       0     0     0
>             c6t6d0                 ONLINE       0     0     0
>             c6t7d0                 ONLINE       0     0     0
>             c6t1d0                 ONLINE       0     0     1
>         logs
>           /dev/zvol/dsk/rpool/log  ONLINE       0     0     0
>         cache
>           c6t0d0p0                 ONLINE       0     0     0
>         spares
>           c9d1                     INUSE     currently in use
>
> errors: No known data errors
>
> And this has been going on for a week now, always restarting when it should
> complete.
>
> The questions in my mind atm:
>
> 1. How can i determine the cause for each resilver? Is there a log?
>
> 2. Why does it resilver the same data over and over, and not just the
> changed bits?
>
> 3. Can i force remove c9d1 as it is no longer needed but c11t3 can be
> resilvered instead?
>
> I'm running opensolaris 134, but the event originally happened on 111b. I
> upgraded and tried quiescing snapshots and IO, none of which helped.
>
> I've already ordered some new hardware to recreate this entire array as
> raidz2 among other things, but there's about a week of time when I can run
> debuggers and traces if instructed to.
>
> - Tuomas
>
>
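
One way to at least timestamp the restarts (it does not explain their cause) would be to sample the output of `zpool status tank` periodically and watch the elapsed time on the "scrub:" line. Below is a minimal Python sketch, assuming the progress line keeps the format shown in the quoted output; other builds may print it differently. The function names `parse_progress` and `detect_restarts` are made up for illustration and are not part of any ZFS tooling.

```python
import re

# Pattern for the progress line as it appears in the quoted status output, e.g.
# "scrub: resilver in progress for 5h33m, 22.47% done, 19h10m to go".
# Assumption: the hours part may be missing early in a run ("45m").
PROGRESS_RE = re.compile(
    r"resilver in progress for (?:(\d+)h)?(\d+)m, ([\d.]+)% done"
)


def parse_progress(status_text):
    """Return (elapsed_minutes, percent_done), or None if no resilver line is found."""
    match = PROGRESS_RE.search(status_text)
    if match is None:
        return None
    hours = int(match.group(1) or 0)
    minutes = int(match.group(2))
    return hours * 60 + minutes, float(match.group(3))


def detect_restarts(samples):
    """Given (elapsed_minutes, percent_done) samples in time order, return the
    indices at which the elapsed time went backwards, i.e. the resilver began again."""
    restarts = []
    for i in range(1, len(samples)):
        if samples[i][0] < samples[i - 1][0]:
            restarts.append(i)
    return restarts


def estimate_remaining_minutes(elapsed_minutes, percent_done):
    """Linear extrapolation of the time left; only a rough estimate."""
    if percent_done <= 0:
        return None
    total = elapsed_minutes / (percent_done / 100.0)
    return total - elapsed_minutes
```

Fed with the status text captured every few minutes (from cron, say), the indices returned by `detect_restarts` give the approximate restart times, which can then be compared against other system logs from the same moments.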
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss