>>>>> "mb" == Marc Bevand <[EMAIL PROTECTED]> writes:

    mb> Ask your hardware vendor. The hardware corrupted your data,
    mb> not ZFS.

You absolutely do NOT have adequate basis to make this statement.

I would further argue that you are probably wrong, and that, based on
what we know, the pool was probably corrupted by a bug in ZFS.
Simply because ZFS is (a) able to detect problems with hardware when
they exist, and (b) ringing an alarm bell of some sort, does NOT
exonerate ZFS.  Yet AIUI that is your position.

Further, ZFS's ability to use zpool-level redundancy to heal problems
created by its own bugs is not a cause for celebration or an
improvement over filesystems without bugs.  The virtue of the
self-healing is for when hardware actually does fail.  If self-healing
also helps with corruption created by bugs in ZFS, that does not shift
blame for unhealed bug-corruption back to the hardware, nor make ZFS
more robust than a different filesystem without corruption bugs.

    mb> Other filesystems would have returned silently corrupted
    mb> data and it would have maybe taken you days/weeks to
    mb> troubleshoot

possibly.  very likely, other filesystems would have handled it fine.

Boris, have a look at the two links I posted earlier about ``simon
sez, import!'' incantations, and required patches.

  http://opensolaris.org/jive/message.jspa?messageID=192572#194209
  http://sunsolve.sun.com/search/document.do?assetkey=1-66-233602-1
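
For context (and this is my summary, not a recipe: read the linked
documents first, which may also call for specific patches), the
``simon sez'' part is typically a pair of /etc/system tunables that
tell ZFS to keep going instead of panicking on bad metadata during
import.  Something along these lines:

```
* /etc/system -- recovery tunables sometimes suggested for a
* panic-on-import pool; use only as directed by the docs above,
* and remove them once the pool is imported and evacuated
set zfs:zfs_recover = 1
set aok = 1
```

You then reboot, retry the import, copy your data off, and take the
tunables back out.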

panic-on-import sounds a lot like your problem.  Jonathan also posted
http://www.opensolaris.org/jive/thread.jspa?messageID=220125 which
seems to be incomplete instructions for choosing a different
ueberblock.  That helped someone else with a corrupted pool, but the
OP in that thread never wrote it up in recipe form for ignorant
sysadmins like me to follow, so it might not be widely useful.
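
FWIW, the starting point in that thread is inspecting the uberblock
arrays in the vdev labels with zdb.  A rough sketch (the device path
here is made up, and flag behavior varies across builds):

```shell
# Dump the four vdev labels, including their uberblock arrays,
# from one disk in the pool (example device path only):
zdb -ul /dev/dsk/c0t0d0s0

# In the output, compare the txg and timestamp fields to find an
# older, still-intact uberblock; the thread above describes
# (incompletely) how to get ZFS to import from it.
```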

In short, ZFS is unstable and prone to corruption, but may improve
substantially when patched up to the latest revision.  Many fixes are
available now, but some that are in SXCE right now won't reach the
stable, binary-only Solaris until u6, so we haven't yet gained
experience with how much improvement the patches provide.  And
finally, there is no way to back up a ZFS filesystem with lots of
clones that is as robust as traditional Unix backup tools: your best
bet for space-efficient backups is to zfs send/recv the data onto a
separate ZFS pool.
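
For completeness, the send/recv approach looks something like this
(the pool and dataset names are made up; `backup' would be a second
pool, ideally on separate hardware):

```shell
# Take a snapshot and replicate it in full to the backup pool:
zfs snapshot tank/home@mon
zfs send tank/home@mon | zfs recv backup/home

# Later backups send only the delta since the previous snapshot:
zfs snapshot tank/home@tue
zfs send -i mon tank/home@tue | zfs recv backup/home
```

Because recv reconstructs real snapshots on the target pool, the
snapshot structure and its space-sharing survive in a way that
tar/ufsdump-style file-level tools can't preserve.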

In more detail, I think there is some experience here that when a
single storage subsystem hosting both ZFS pools and vxfs filesystems
goes away, the ZFS pools sometimes become corrupt while vxfs rolls
its log and continues.  So, in stable Sol10u5, ZFS is probably more
prone than other logging filesystems to metadata corruption causing
whole-pool failure.  Some fixes are around the corner, and others are
apparently the subject of some philosophical debate.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss