I had an unclean shutdown because of a hang, and suddenly my pool is degraded (I 
realized something was wrong when python dumped core a couple of times).

This is the status before I ran a scrub:

  pool: mypool
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scan: scrub repaired 0 in 0h7m with 0 errors on Mon May 31 09:00:27 2010
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      DEGRADED     0     0     0
          c6t0d0s0  DEGRADED     0     0     0  too many errors

errors: Permanent errors have been detected in the following files:

        mypool/ROOT/May25-2010-Image-Update:<0x3041e>
        mypool/ROOT/May25-2010-Image-Update:<0x31524>
        mypool/ROOT/May25-2010-Image-Update:<0x26d24>
        mypool/ROOT/May25-2010-Image-Update:<0x37234>
        //var/pkg/download/d6/d6be0ef348e3c81f18eca38085721f6d6503af7a
        mypool/ROOT/May25-2010-Image-Update:<0x25db3>
        //var/pkg/download/cb/cbb0ff02bcdc6649da3763900363de7cff78ec72
        mypool/ROOT/May25-2010-Image-Update:<0x26cf6>


I ran a scrub, and this is what it has to say afterwards:

  pool: mypool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scan: scrub repaired 0 in 0h11m with 0 errors on Sat Jun  5 22:43:54 2010
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      DEGRADED     0     0     0
          c6t0d0s0  DEGRADED     0     0     0  too many errors

errors: No known data errors
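
To make the comparison between the two outputs above easier, here is a minimal 
Python sketch that pulls the pool state, the per-device rows, and the errors 
line out of status text like the above. The names summarize_status and 
suspect_devices are hypothetical, and the parsing rules are assumptions based 
only on the layout of the pasted output, not on any official tool or format 
specification.

```python
# Abbreviated copy of the post-scrub output shown above.
SAMPLE = """\
  pool: mypool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      DEGRADED     0     0     0
          c6t0d0s0  DEGRADED     0     0     0  too many errors

errors: No known data errors
"""


def summarize_status(text):
    """Hypothetical helper: extract a few fields from pool status text."""
    summary = {"pool": None, "state": None, "devices": [], "errors": None}
    in_config = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("pool:"):
            summary["pool"] = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("state:"):
            summary["state"] = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("errors:"):
            summary["errors"] = stripped.split(":", 1)[1].strip()
            in_config = False
        elif stripped.startswith("config:"):
            in_config = True
        elif in_config and stripped and not stripped.startswith("NAME"):
            # Assumed row layout: NAME STATE READ WRITE CKSUM [note...]
            fields = stripped.split(None, 5)
            if len(fields) >= 5:
                summary["devices"].append({
                    "name": fields[0],
                    "state": fields[1],
                    # Counters are kept as strings; they are not parsed as numbers.
                    "read": fields[2],
                    "write": fields[3],
                    "cksum": fields[4],
                    "note": fields[5] if len(fields) > 5 else "",
                })
    return summary


def suspect_devices(summary):
    """Return names of all rows (pool or device) not reported as ONLINE."""
    return [d["name"] for d in summary["devices"] if d["state"] != "ONLINE"]


if __name__ == "__main__":
    result = summarize_status(SAMPLE)
    print(result["state"], result["errors"])
    print(suspect_devices(result))
```

Of the fields this sketch extracts, the two outputs differ only in the errors 
line: the device state, the counters, and the "too many errors" note are the 
same before and after the scrub, which is what the questions below are about.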

A few questions:

1. Have the errors really gone away? Can I just run 'zpool clear' and be 
content that the errors are really gone?

2. Why did the errors occur anyway if ZFS guarantees on-disk consistency? I 
wasn't writing anything. Those files were definitely not being touched when the 
hang and unclean shutdown happened.

I mean, I don't mind if I create or modify a file and it doesn't land on disk 
because an unclean shutdown happened, but a bunch of unrelated files getting 
corrupted is sort of painful to digest.

3. The action says "Determine if the device needs to be replaced". How the heck 
do I do that?
-- 
This message posted from opensolaris.org