Re: [zfs-discuss] zfs corruptions in pool

Toby Thain Tue, 08 Jun 2010 18:06:04 -0700


On 6-Jun-10, at 7:11 AM, Thomas Maier-Komor wrote:

On 06.06.2010 08:06, devsk wrote:
I had an unclean shutdown because of a hang and suddenly my pool is degraded (I realized something is wrong when python dumped core a couple of times).
This is before I ran scrub:

 pool: mypool
state: DEGRADED
status: One or more devices has experienced an error resulting in data
       corruption.  Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
       entire pool from backup.
  see: http://www.sun.com/msg/ZFS-8000-8A
scan: scrub repaired 0 in 0h7m with 0 errors on Mon May 31 09:00:27 2010
config:

       NAME        STATE     READ WRITE CKSUM
       mypool      DEGRADED     0     0     0
         c6t0d0s0  DEGRADED     0     0     0  too many errors

errors: Permanent errors have been detected in the following files:

       mypool/ROOT/May25-2010-Image-Update:<0x3041e>
       mypool/ROOT/May25-2010-Image-Update:<0x31524>
       mypool/ROOT/May25-2010-Image-Update:<0x26d24>
       mypool/ROOT/May25-2010-Image-Update:<0x37234>
       //var/pkg/download/d6/d6be0ef348e3c81f18eca38085721f6d6503af7a
       mypool/ROOT/May25-2010-Image-Update:<0x25db3>
       //var/pkg/download/cb/cbb0ff02bcdc6649da3763900363de7cff78ec72
       mypool/ROOT/May25-2010-Image-Update:<0x26cf6>


I ran scrub and this is what it has to say afterwards.

 pool: mypool
state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
       attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
       using 'zpool clear' or replace the device with 'zpool replace'.
  see: http://www.sun.com/msg/ZFS-8000-9P
scan: scrub repaired 0 in 0h11m with 0 errors on Sat Jun  5 22:43:54 2010
config:

       NAME        STATE     READ WRITE CKSUM
       mypool      DEGRADED     0     0     0
         c6t0d0s0  DEGRADED     0     0     0  too many errors

errors: No known data errors
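
To compare the two reports above without eyeballing them, the config table and the errors line can be pulled out of saved `zpool status` text. This is only a minimal sketch: the function names are made up for illustration, the sample is the post-scrub output quoted above, and it assumes plain integer counters (it does not handle abbreviated counts or nested vdev layouts).

```python
import re

# Sample copied from the post-scrub report quoted above.
SAMPLE = """\
config:

        NAME        STATE     READ WRITE CKSUM
        mypool      DEGRADED     0     0     0
          c6t0d0s0  DEGRADED     0     0     0  too many errors

errors: No known data errors
"""

# One config row: name, state, three integer counters, optional trailing note.
ROW = re.compile(
    r"^\s*(\S+)\s+(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s*(.*)$"
)


def parse_config(text):
    """Return one dict per pool/device row found in the config table."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            name, state, read, write, cksum, note = match.groups()
            rows.append({
                "name": name,
                "state": state,
                "read": int(read),
                "write": int(write),
                "cksum": int(cksum),
                "note": note.strip(),
            })
    return rows


def has_known_data_errors(text):
    """False only when the report says 'No known data errors'."""
    for line in text.splitlines():
        if line.startswith("errors:"):
            return "No known data errors" not in line
    # No errors line at all: treat as unknown and report True to be safe.
    return True


if __name__ == "__main__":
    for row in parse_config(SAMPLE):
        print(row["name"], row["state"], row["note"])
    print("known data errors:", has_known_data_errors(SAMPLE))
```

Note that in this report the device stays DEGRADED with "too many errors" even though all three counters read 0 and no data errors are listed, which is exactly the situation the questions below are about.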

A few questions:
1. Have the errors really gone away? Can I just clear and be content that the errors are really gone?
2. Why did the errors occur anyway if ZFS guarantees on-disk consistency? I wasn't writing anything. Those files were definitely not being touched when the hang and unclean shutdown happened.
I mean, I don't mind if I create or modify a file and it doesn't land on disk because an unclean shutdown happened, but a bunch of unrelated files getting corrupted is sort of painful to digest.
3. The action says "Determine if the device needs to be replaced". How the heck do I do that?

Is it possible that this system runs on VirtualBox? At least I've
seen such a thing happen on VirtualBox but never on a real machine.


As I postulated in the relevant forum thread there:
http://forums.virtualbox.org/viewtopic.php?t=13661
(can't check URL, the site seems down for me atm)

The reason why the errors have gone away might be that metadata has
three copies, IIRC. So if your disk only had corruption in the metadata
area, these errors can be repaired by scrubbing the pool.
The smartmontools might help you figure out if the disk is broken. But
if you only had an unexpected shutdown and now everything is clean after
a scrub, I wouldn't expect the disk to be broken. You can get the
smartmontools from opencsw.org.
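
As a rough illustration of the smartmontools check, the sketch below wraps `smartctl -H`, which prints only the drive's overall health verdict. The helper names are hypothetical, and the device path in the usage note (`/dev/rdsk/c6t0d0s0`, derived from the device name in the pool output) is an assumption; some controllers also need an explicit `-d` device type, which is not handled here.

```python
import subprocess


def smart_health_command(device):
    """Build the smartctl call; -H requests just the overall health verdict."""
    return ["smartctl", "-H", device]


def parse_smart_health(output):
    """Extract the verdict from smartctl -H output, or None if absent.

    ATA disks report 'SMART overall-health self-assessment test result: ...',
    SCSI disks report 'SMART Health Status: ...'.
    """
    markers = (
        "overall-health self-assessment test result:",
        "SMART Health Status:",
    )
    for line in output.splitlines():
        if any(marker in line for marker in markers):
            return line.split(":", 1)[1].strip()
    return None


def check_disk(device):
    """Run smartctl on the device (usually needs root) and return the verdict."""
    result = subprocess.run(
        smart_health_command(device),
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to flag problems
    )
    return parse_smart_health(result.stdout)
```

Usage would be something like `check_disk("/dev/rdsk/c6t0d0s0")`. A passing verdict does not prove the disk is healthy, so it is worth looking at the full attribute listing (`smartctl -a`) as well. Inside a VirtualBox guest the virtual disk generally does not expose meaningful SMART data, so this check belongs on the host.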
If your system is really running on VirtualBox, I'd recommend that you
turn off disk write caching in VirtualBox.

Specifically, stop it from ignoring cache flush. Caching is irrelevant if flushes are being correctly handled.

ZFS isn't the only software system that will suffer inconsistencies/corruption in the guest if flushes are ignored, of course.
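
For reference, a minimal sketch of how the flush setting is changed, going by the VirtualBox manual as I recall it: the per-disk extradata key `IgnoreFlush` is set to 0 with `VBoxManage setextradata` so that guest flush requests are honored. The helper names and the VM name `osol` are hypothetical, and the controller names (`piix3ide` for IDE, `ahci` for SATA) and LUN numbering should be checked against the documentation for your VirtualBox version.

```python
def ignore_flush_key(controller, lun):
    """Build the extradata key for one virtual disk.

    controller: 'piix3ide' (IDE) or 'ahci' (SATA), as named in the manual.
    lun: disk slot on that controller (for IDE, 0 is primary master).
    """
    if controller not in ("piix3ide", "ahci"):
        raise ValueError("unsupported controller: %r" % (controller,))
    if lun < 0:
        raise ValueError("lun must be non-negative")
    return "VBoxInternal/Devices/%s/0/LUN#%d/Config/IgnoreFlush" % (controller, lun)


def honor_flush_command(vm_name, controller="piix3ide", lun=0):
    """Return the VBoxManage argument list that makes flushes be honored.

    A value of 0 means 'do not ignore flush requests'.
    """
    return [
        "VBoxManage",
        "setextradata",
        vm_name,
        ignore_flush_key(controller, lun),
        "0",
    ]


if __name__ == "__main__":
    # 'osol' is a made-up VM name; substitute your own.
    print(" ".join(honor_flush_command("osol", "ahci", 0)))
```

The command is run on the host while the VM is powered off, once per virtual disk that backs the pool. It should only be applied to hard disks, not to CD/DVD drives.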


--Toby

Search the OpenSolaris forum
of VirtualBox. There is an article somewhere on how to do this. IIRC the
subject is something like 'zfs pool corruption'. But it is also
somewhere in the docs.

HTH,
Thomas
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

