On Fri, Apr 29, 2011 at 1:23 PM, Freddie Cash <fjwc...@gmail.com> wrote:
> Is there any way, yet, to import a pool with corrupted space_map
> errors, or "zio->io_type != ZIO_TYPE_WRITE" assertions?
>
> I have a pool comprised of 4 raidz2 vdevs of 6 drives each.  I have
> almost 10 TB of data in the pool (3 TB actual disk space used due to
> dedup and compression).  While testing various failure modes, I have
> managed to corrupt the pool to the point where it won't import.  So
> much for being bulletproof.  :(
>
> If I try to import the pool normally, it gives corrupted space_map errors.
>
> If I try to "import -F" the pool, it complains that "zio->io_type !=
> ZIO_TYPE_WRITE".
>
> I've also tried the above with "-o readonly=on" and "-R
> some/other/root" variations.
>
> There's also no zpool.cache file anywhere to be found, and creating a
> blank file doesn't help.
>
> Does this mean that a 10 TB pool can be lost due to a single file
> being corrupted, or a single piece of pool metadata being corrupted?
> And that there are *still* no recovery tools for situations like this?
>
> Running ZFSv28 on 64-bit FreeBSD 8-STABLE.
>
> For the curious, the failure mode that causes this?  Rebooting while 8
> simultaneous rsyncs were running, which were not killed by the
> shutdown process for some reason, which prevented 8 ZFS filesystems
> from being unmounted, which prevented the pool from being exported
> (even though I have a "zfs unmount -f" and "zpool export -f"
> fail-safe), which locked up the shutdown process requiring a power
> reset.

Well, by commenting out the VERIFY line for zio->io_type !=
ZIO_TYPE_WRITE and compiling a new kernel, I can import the pool, but
only with -F and -o readonly=on.  :(

Trying to import it read-write gives dmu_free_range errors, along with
a bunch of other dmu panics, and takes the system down.  :(

How can it be that after 28 pool format revisions and 5+ years of
development, ZFS is still this brittle?  I've found lots of threads
from 2007 about this very issue, with "don't do that" and "it's not an
issue" and "there's no need for a pool consistency checker" and other
similar "head in the sand" responses.  :(  But still no way to prevent
or fix this form of corruption.

It's great that I can get the pool to import read-only, so the data is
still available.  But that really doesn't help when I've already
rebuilt this pool twice due to this issue.

-- 
Freddie Cash
fjwc...@gmail.com
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
