Bayard,

Indeed, you did answer it - and thanks for getting back to me - your suggestion was spot ON!
However, the simple zpool clear/scrub cycle wouldn't work in our case - at least initially. After multiple 'rinse/repeats', the offending file - or its hex representation - would reappear. In fact, the CKSUM errors would often mount... Logically, this seems to make some sense; that zfs would attempt to reconstitute the damaged file with each scrub...(?)

In any case, after gathering the nerve to start deleting old snapshots - including the one with the offending file - the clear/scrub process worked a charm.

Many thanks again!

Lou Picciano

----- Original Message -----
From: "Bayard G. Bell" <buffer.g.overf...@gmail.com>
To: z...@lists.illumos.org
Cc: zfs-discuss@opensolaris.org
Sent: Sunday, January 29, 2012 3:22:39 PM
Subject: Re: [zfs] Oddly-persistent file error on ZFS root pool

Lou,

Tried to answer this when you asked on IRC. Try a zpool clear and scrub again to see if the errors persist.

Cheers,
Bayard

On Sat, 2012-01-28 at 17:52 +0000, Lou Picciano wrote:
> Hello ZFS wizards,
>
> Have an odd ZFS problem I'd like to run by you -
>
> Root pool on this machine is a 'simple' mirror - just two disks.
>
> # zpool status
>
>         NAME          STATE     READ WRITE CKSUM
>         rpool         ONLINE       0     0     3
>           mirror-0    ONLINE       0     0     6
>             c2t0d0s0  ONLINE       0     0     6
>             c2t1d0s0  ONLINE       0     0     6
>
> errors: Permanent errors have been detected in the following files:
>
>         rpool/ROOT/openindiana-userland-154@zfs-auto-snap_monthly-2011-11-22-09h19:/etc/svc/repository-boot-tmpEdaGba
>
> ... or similar; CKSUM counts have varied, but were always in that 1x - 2x,
> 'symmetrical' pattern.
>
> After working through the problems above, scrubbing and zfs destroying the
> snapshot with 'permanent errors', the CKSUMS clear up, but vestiges of the
> file remain as hex addresses:
>
>         NAME          STATE     READ WRITE CKSUM
>         rpool         ONLINE       0     0     0
>           mirror-0    ONLINE       0     0     0
>             c2t0d0s0  ONLINE       0     0     0
>             c2t1d0s0  ONLINE       0     0     0
>
> errors: Permanent errors have been detected in the following files:
>
>         <0x18e73>:<0x78007>
>
> I have no evidence that ZFS is itself the direct culprit here; it may just be
> on the receiving end of one of the couple of problems we've recently worked
> through on this machine:
> 1. a defective CPU, managed by the fault manager, but without a
>    fully-configured crashdump (now rectified), then
> 2. the SandyBridge 'interrupt storm' problem, which we seem to have now
>    worked around.
>
> The storage pools are scrubbed pretty regularly, and we generally have no
> cksum errors at all. At one point, vmstat reported 7+ _million+ interrupt
> faults over 5 seconds! I've attempted to clear stats on the pool as well
> (didn't expect this to work, but worth a try, right?)
>
> Important to note that Memtest+ had been run, last time for ~14 hrs, with no
> error reported.
>
> Don't think the storage controller is the culprit, either, as _all_ drives
> are controlled by the P67A - and no other problems seen. And no errors
> reported via smartctl.
>
> Would welcome input from two perspectives:
>
> 1) Before I rebuild the pool/reinstall/whatever, is anyone here interested in
>    any diagnostic output which might still be available? Is any of this useful
>    as a bug report?
> 2) Then, would love to hear ideas on a solution.
>
> Proposed solutions include:
> 1) creating new BE based on snap of root pool:
>    - Snapshot root pool
>    - (zfs send to datapool for safekeeping)
>    - Split rpool
>    - zpool create newpool (on Drive 'B')
>    - beadm -p create newpool NEWboot (being sure to use slice 0 of Drive 'B')
>
> 2) Simply deleting _all_ snapshots on the rpool.
>
> 3) complete re-install
>
> Tks for feedback.
> Lou Picciano
>
> -------------------------------------------
> illumos-zfs
> Archives: https://www.listbox.com/member/archive/182191/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/182191/22062040-29ecd758
> Modify Your Subscription: https://www.listbox.com/member/?&
> Powered by Listbox: http://www.listbox.com
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
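
The fix described in this thread turned on noticing that the damaged file lived in a snapshot (the `dataset@snapshot:/path` form) and that, once the snapshot was destroyed, the entry lingered as a bare hex pair until a later clear/scrub. As a rough sketch, not something posted in the thread, the error list from `zpool status` could be sorted into those cases as below. The function name `classify_errors`, the category names, and the assumption that only these entry shapes appear are illustrative; real output may contain other forms that this does not handle.

```python
import re

# Sample text modeled on the `zpool status` output quoted in this thread.
SAMPLE = """\
errors: Permanent errors have been detected in the following files:

        rpool/ROOT/openindiana-userland-154@zfs-auto-snap_monthly-2011-11-22-09h19:/etc/svc/repository-boot-tmpEdaGba
        <0x18e73>:<0x78007>
"""

# Matches entries such as <0x18e73>:<0x78007>, i.e. a pair of hex ids
# shown in place of a dataset name and path.
HEX_REF = re.compile(r"^<0x[0-9a-fA-F]+>:<0x[0-9a-fA-F]+>$")


def classify_errors(status_text):
    """Group 'Permanent errors' entries into snapshot files, hex pairs, and the rest."""
    result = {"snapshot": [], "hex": [], "other": []}
    in_errors = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line.startswith("errors:"):
            # Only collect entries that follow the 'Permanent errors' header.
            in_errors = "Permanent errors" in line
            continue
        if not in_errors or not line:
            continue
        if HEX_REF.match(line):
            result["hex"].append(line)
        elif ":" in line and "@" in line.split(":", 1)[0]:
            # dataset@snapshot:/path/inside/snapshot
            dataset_snap, path = line.split(":", 1)
            result["snapshot"].append((dataset_snap, path))
        else:
            result["other"].append(line)
    return result


if __name__ == "__main__":
    for kind, entries in classify_errors(SAMPLE).items():
        print(kind, entries)
```

Entries under `snapshot` name the snapshot that would have to be destroyed before the error can go away; entries under `hex` are the leftover references of the kind reported above after the snapshot was removed.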