There's probably a way to clean up those old entries; I'm just not sure what it is. Is the data shared with any snapshots or clones? I'd expect you have to remove every reference to the blocks: not just the files themselves, but also any snapshots or clones that still hold them.
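
One way to see which entries are still tied to a named dataset and which are bare object references is to sort the error list from `zpool status -v` mechanically. Below is a rough sketch, not a fix, assuming the output format shown in the quoted messages. The function name `classify_errors` is made up for illustration. As far as I know the first field names the dataset and the second the object number, and a hex value in the first field means the name could no longer be resolved, but treat that reading as an assumption.

```python
import re

# Matches entries such as "<0x86>:<0x38fcd>" or "zroot/packages:<0xad58>",
# where the object is printed in hex because no path could be resolved.
OBJ_REF = re.compile(r"^(?P<dataset>.+):<0x(?P<obj>[0-9a-fA-F]+)>$")


def classify_errors(status_text):
    """Split the 'errors:' list of `zpool status -v` into paths and object refs.

    Hypothetical helper: returns (paths, refs), where refs is a list of
    (dataset_field, object_number) tuples.
    """
    paths, refs = [], []
    in_list = False
    for line in status_text.splitlines():
        entry = line.strip()
        if entry.startswith("errors: Permanent errors"):
            in_list = True
            continue
        if not in_list or not entry:
            continue
        match = OBJ_REF.match(entry)
        if match:
            refs.append((match.group("dataset"), int(match.group("obj"), 16)))
        else:
            paths.append(entry)
    return paths, refs


# Sample text copied from the error lists quoted in this thread.
sample = """\
errors: Permanent errors have been detected in the following files:

        zroot/packages:<0xad58>
        <0x86>:<0x38fcd>
        //platform/sun4us/failsafe
"""

paths, refs = classify_errors(sample)
print(paths)
print(refs)
```

Entries that come back as object references with a named dataset are the ones where I'd look for snapshots or clones first.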

  - Garrett

On Thu, 2010-07-15 at 10:12 -0700, Kris Kasner wrote:
> Today at 09:44, Garrett D'Amore <garr...@nexenta.com> wrote:
>
> >
> > Those corrupt files are corrupt forever. Until they are removed. I
> > recommend doing a scrub. There are probably other experts here
> > (Richard?) who can suggest a permanent fix.
> >
>
> Right, and we're OK with that.. We were lucky - all of the corrupt files are
> non-essential. When I remove the files and replace them, I get something that
> looks like a hex device:block number (ie: <0x86>:<0x38fcd>).
>
> The server is a v440, so I was able to have someone add some extra drives.
> zpool replace zroot c1t1d0s2 c1t2d0s2
> failed to complete.. it left zpool status looking like this:
> 10:40:33 catalina(36)> sudo zpool status -v
> Password:
>   pool: zroot
>  state: DEGRADED
> status: One or more devices has experienced an error resulting in data
>         corruption. Applications may be affected.
> action: Restore the file in question if possible. Otherwise restore the
>         entire pool from backup.
>    see: http://www.sun.com/msg/ZFS-8000-8A
>  scrub: resilver completed after 0h31m with 4 errors on Tue Jul 13 11:47:07 2010
> config:
>
>         NAME            STATE     READ WRITE CKSUM
>         zroot           DEGRADED    28     0     0
>           mirror        DEGRADED    60     0    27
>             replacing   DEGRADED    31     0    53
>               c1t1d0s2  DEGRADED   129     0    99  too many errors
>               c1t2d0s2  ONLINE       0     0    84  24.5G resilvered
>             c1t0d0s2    ONLINE       0     0    87  24.4G resilvered
>
> errors: Permanent errors have been detected in the following files:
>
>         //usr/dt/lib/sparcv9/libDtWidget.so.2
>         //platform/sun4us/failsafe
>         //opt/staroffice8/share/gallery/www-graf/bluleft.gif
>         /var/tmp/patches/10_Recommended/125541-04/SUNWthunderbird/reloc/lib/thunderbird/components/librdf.so
>
>
> If I delete one of these files, zpool status -v shows that device/block
> identifier I mentioned previously.. I've run a few scrubs, but they don't
> change anything.
>
>
> The system appears stable right now, our internal customers have no idea
> anything is wrong (IE, their apps are stable). We're planning on migrating
> them to a Niagara blade to return things to "known good".
>
>
> I'm still curious to know if anyone knows a fix for this kind of issue, if
> there is one. I fully expect that if I was running UFS on one drive and it
> failed like this zfs drive failed the system would have panicked. That's a
> big win. I would still like to get to the bottom of this issue. :-)
>
>
> Thanks again for your replies.
>
> --Kris
>
>
> >
> >
> >>
> >> Today at 16:15, Garrett D'Amore <garr...@nexenta.com> wrote:
> >>
> >>> Hey Kris (glad to see someone from my QCOM days!):
> >>>
> >>> It should automatically clear itself when you replace the disk. Right
> >>> now you're still degraded since you don't have full redundancy.
> >>>
> >>>   - Garrett
> >>>
> >>>
> >>> On Mon, 2010-07-12 at 16:10 -0700, Kris Kasner wrote:
> >>>> Hi Folks..
> >>>>
> >>>> I have a system that was inadvertently left unmirrored for root. We were
> >>>> able to add a mirror disk, resilver, and fix the corrupted files (nothing
> >>>> very interesting was corrupt, whew), but zpool status -v still shows
> >>>> errors..
> >>>>
> >>>> Will this self correct when we replace the degraded disk and resilver?
> >>>> Or is there something else that I'm not finding that I need to do to
> >>>> clean up?
> >>>>
> >>>> This is Solaris 10 u8, zpool v15
> >>>> 15:52:50 catalina(34)> sudo zpool status -v
> >>>>   pool: zroot
> >>>>  state: DEGRADED
> >>>> status: One or more devices has experienced an error resulting in data
> >>>>         corruption. Applications may be affected.
> >>>> action: Restore the file in question if possible. Otherwise restore the
> >>>>         entire pool from backup.
> >>>>    see: http://www.sun.com/msg/ZFS-8000-8A
> >>>>  scrub: resilver completed after 0h48m with 15 errors on Mon Jul 12 15:41:50 2010
> >>>> config:
> >>>>
> >>>>         NAME          STATE     READ WRITE CKSUM
> >>>>         zroot         DEGRADED    18     0     0
> >>>>           mirror      DEGRADED    44     0    23
> >>>>             c1t1d0s2  DEGRADED    74     0    23  too many errors
> >>>>             c1t0d0s2  ONLINE       0     0    67  29.8G resilvered
> >>>>
> >>>> errors: Permanent errors have been detected in the following files:
> >>>>
> >>>>         zroot/packages:<0xad58>
> >>>>         zroot/packages:<0x11477>
> >>>>         zroot/packages:<0x2531d>
> >>>>         <0x6e>:<0xc0f2>
> >>>>         <0x6e>:<0xce68>
> >>>>         <0x6e>:<0x28d9f>
> >>>>         <0x6e>:<0x2b5c1>
> >>>>         <0x76>:<0x17369>
> >>>>         <0x86>:<0x11fda>
> >>>>         <0x86>:<0x13253>
> >>>>         <0x86>:<0x13346>
> >>>>         <0x86>:<0x33ed3>
> >>>>         <0x86>:<0x38fcd>
> >>>>         <0x86>:<0x39007>
> >>>> 15:53:04 catalina(35)>
> >>>>
> >>>>
> >>>> Thanks for any suggestions. The system is in another city, so I can't
> >>>> quickly test replacing the disk and see what happens..
> >>>>
> >>>> Kris
> >>>>
> >>>
> >>>
> >>
> >
> >
>
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss