Today at 09:44, Garrett D'Amore <garr...@nexenta.com> wrote:


Those corrupt files are corrupt forever, until they are removed.  I
recommend doing a scrub.  There are probably other experts here
(Richard?) who can suggest a permanent fix.


Right, and we're OK with that. We were lucky: all of the corrupt files are non-essential. When I remove the files and replace them, zpool status -v shows something that looks like a hex device:block number in place of each file name (e.g. <0x86>:<0x38fcd>).

The server is a v440, so I was able to have someone add some extra drives.
zpool replace zroot c1t1d0s2 c1t2d0s2 failed to complete; it left zpool status looking like this:
10:40:33 catalina(36)> sudo zpool status -v
Password:
  pool: zroot
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
scrub: resilver completed after 0h31m with 4 errors on Tue Jul 13 11:47:07 2010
config:

        NAME            STATE     READ WRITE CKSUM
        zroot           DEGRADED    28     0     0
          mirror        DEGRADED    60     0    27
            replacing   DEGRADED    31     0    53
              c1t1d0s2  DEGRADED   129     0    99  too many errors
              c1t2d0s2  ONLINE       0     0    84  24.5G resilvered
            c1t0d0s2    ONLINE       0     0    87  24.4G resilvered

errors: Permanent errors have been detected in the following files:

        //usr/dt/lib/sparcv9/libDtWidget.so.2
        //platform/sun4us/failsafe
        //opt/staroffice8/share/gallery/www-graf/bluleft.gif
        /var/tmp/patches/10_Recommended/125541-04/SUNWthunderbird/reloc/lib/thunderbird/components/librdf.so


If I delete one of these files, zpool status -v shows the device/block-style identifier I mentioned previously in its place. I've run a few scrubs, but they don't change anything.
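For anyone scripting around this output, the entries under "errors:" in this thread come in three shapes: a resolved file path, a dataset name followed by a hex number (zroot/packages:<0xad58>), and a bare hex pair (<0x86>:<0x38fcd>). My understanding is that the hex pair is a dataset ID and an object number that ZFS can no longer map to a name, rather than a device and block, but treat that reading as an assumption. Below is a minimal Python sketch that sorts entries into those three shapes; the function name classify_error_entry and the kind labels are made up for illustration and are not part of any ZFS tool.

```python
import re

# Patterns for the entry shapes seen in the `zpool status -v` listings in this thread.
_HEX_PAIR = re.compile(r"^<0x([0-9a-fA-F]+)>:<0x([0-9a-fA-F]+)>$")
_NAMED_DATASET = re.compile(r"^([^<>:\s][^<>:]*):<0x([0-9a-fA-F]+)>$")


def classify_error_entry(line):
    """Return a (kind, details) tuple for one line of the errors list.

    The kind labels are illustrative only:
      "unresolved"     -> (first_hex, second_hex) as integers
      "dataset_object" -> (dataset_name, object_hex) with the hex as an integer
      "path"           -> the entry text unchanged
      "blank"          -> None
    """
    entry = line.strip()
    if not entry:
        return ("blank", None)
    m = _HEX_PAIR.match(entry)
    if m:
        return ("unresolved", (int(m.group(1), 16), int(m.group(2), 16)))
    m = _NAMED_DATASET.match(entry)
    if m:
        return ("dataset_object", (m.group(1), int(m.group(2), 16)))
    return ("path", entry)
```

Feeding it each line after the "errors:" header gives a quick count of how many entries still resolve to a file that can be restored and how many no longer do.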


The system appears stable right now, and our internal customers have no idea anything is wrong (i.e., their apps are stable). We're planning on migrating them to a Niagara blade to return things to "known good".


I'm still curious whether anyone knows a fix for this kind of issue, if there is one. I fully expect that if I had been running UFS on one drive and it had failed the way this ZFS drive did, the system would have panicked. That's a big win. I would still like to get to the bottom of this issue. :-)


Thanks again for your replies.

--Kris







Today at 16:15, Garrett D'Amore <garr...@nexenta.com> wrote:

Hey Kris (glad to see someone from my QCOM days!):

It should automatically clear itself when you replace the disk.  Right
now you're still degraded since you don't have full redundancy.

        - Garrett


On Mon, 2010-07-12 at 16:10 -0700, Kris Kasner wrote:
Hi Folks..

I have a system that was inadvertently left unmirrored for root. We were able
to add a mirror disk, resilver, and fix the corrupted files (nothing very
interesting was corrupt, whew), but zpool status -v still shows errors.

Will this self correct when we replace the degraded disk and resilver? Or is
there something else that I'm not finding that I need to do to clean up?

This is Solaris 10 u8, zpool v15
15:52:50 catalina(34)> sudo zpool status -v
   pool: zroot
  state: DEGRADED
status: One or more devices has experienced an error resulting in data
         corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
         entire pool from backup.
    see: http://www.sun.com/msg/ZFS-8000-8A
  scrub: resilver completed after 0h48m with 15 errors on Mon Jul 12 15:41:50 2010
config:

         NAME          STATE     READ WRITE CKSUM
         zroot         DEGRADED    18     0     0
           mirror      DEGRADED    44     0    23
             c1t1d0s2  DEGRADED    74     0    23  too many errors
             c1t0d0s2  ONLINE       0     0    67  29.8G resilvered

errors: Permanent errors have been detected in the following files:

         zroot/packages:<0xad58>
         zroot/packages:<0x11477>
         zroot/packages:<0x2531d>
         <0x6e>:<0xc0f2>
         <0x6e>:<0xce68>
         <0x6e>:<0x28d9f>
         <0x6e>:<0x2b5c1>
         <0x76>:<0x17369>
         <0x86>:<0x11fda>
         <0x86>:<0x13253>
         <0x86>:<0x13346>
         <0x86>:<0x33ed3>
         <0x86>:<0x38fcd>
         <0x86>:<0x39007>
15:53:04 catalina(35)>
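
To compare the per-device READ/WRITE/CKSUM counters between runs (for example before and after a resilver), the config table above can be pulled into tuples. This is only a sketch: it assumes the column layout shown in this thread, the function name parse_device_counters is hypothetical, and any counter not printed as a plain integer would simply be skipped.

```python
def parse_device_counters(status_text):
    """Extract (name, state, read, write, cksum) rows from `zpool status` text.

    Only rows between the NAME/STATE/READ/WRITE/CKSUM header and the
    "errors:" line are considered. Trailing notes such as
    "too many errors" or "29.8G resilvered" are ignored.
    """
    rows = []
    in_config = False
    for line in status_text.splitlines():
        fields = line.split()
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_config = True
            continue
        if not in_config:
            continue
        if line.strip().startswith("errors:"):
            break
        # A device row has a name, a state, and three integer counters.
        if len(fields) >= 5 and all(f.isdigit() for f in fields[2:5]):
            rows.append(
                (fields[0], fields[1], int(fields[2]), int(fields[3]), int(fields[4]))
            )
    return rows
```

Running it over two saved copies of the status output and diffing the tuples shows which devices are still accumulating errors.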


Thanks for any suggestions. The system is in another city, so I can't quickly
test replacing the disk and see what happens.

Kris







--

Thomas Kris Kasner
Qualcomm Inc.
5775 Morehouse Drive
San Diego, CA 92121
(858)658-4932


Outside of a dog, A book is man's best friend.
        Inside of a dog... It's too dark to read! (unknown)
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
