My research into recovering from a pool whose slog goes MIA while the pool
is off-line resulted in two possible methods, one requiring prior
preparation and the other a copy of the zpool.cache including data for the
failed pool.

The first method is to simply dump a copy of the slog device right after
you make it (just dd if=/dev/dsk/<slog> of=slog.dump). If the device ever
fails, theoretically you could restore the image onto a replacement (dd
if=slog.dump of=/dev/dsk/<slog>) and import the pool.

My initial testing of that method was promising, but that testing was
performed by intentionally corrupting the slog device and restoring the
copy back onto the original device. When I tried restoring the slog dump
onto a different device, it didn't work out so well. zpool import
recognized the different device as a log device for the pool, but still
complained there were unknown missing devices and refused to import the
pool. It looks like the device serial number is stored as part of the zfs
label, resulting in confusion when that label is restored onto a different
device. As such, this method is only usable if the underlying fault is
simply corruption and the original device is available to restore onto.
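
For anyone who wants to script the dump/restore, here is a rough sketch of
the two dd commands above wrapped in shell functions, with a checksum kept
next to the image so a damaged dump is caught before it gets written over
the device. The function names (dump_slog, restore_slog) and the .cksum
sidecar file are my own invention for illustration, not anything zfs
provides, and per the testing above this is only expected to help when
restoring onto the original device.

```shell
#!/bin/sh
# Sketch only: function names and the .cksum sidecar file are hypothetical.
# Backticks are used so this also runs under the legacy Bourne shell.

# dump_slog DEVICE IMAGE
# Copy the raw slog device to an image file and record its checksum.
dump_slog() {
    dd if="$1" of="$2" bs=1024k || return 1
    cksum < "$2" > "$2.cksum"
}

# restore_slog IMAGE DEVICE
# Refuse to write the image back unless it still matches the checksum
# recorded at dump time.
restore_slog() {
    want=`cat "$1.cksum"` || return 1
    have=`cksum < "$1"`
    if [ "$want" != "$have" ]; then
        echo "image $1 does not match its recorded checksum" >&2
        return 1
    fi
    dd if="$1" of="$2" bs=1024k
}
```

Usage would be along the lines of "dump_slog /dev/dsk/<slog> slog.dump"
right after adding the slog, and "restore_slog slog.dump /dev/dsk/<slog>"
followed by a zpool import of the pool if the device contents ever get
corrupted.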

The second method is described at:

        http://opensolaris.org/jive/thread.jspa?messageID=377018

Unfortunately, the included binary does not run under S10U6, and after half
an hour or so of trying to get the source code to compile under S10U6 I
gave up. I found some of the missing header files in the S10U6 grub source
code package, which presumably match the actual data structures in use
under S10, but other pieces were missing too, and copying them out of the
opensolaris code just got messier and messier. Unless someone with more
zfs-fu than me creates a binary for S10, this approach is not going to be
viable.

Unofficially I was told that a fix for this issue is expected to be
putback into Nevada around July, but whether or not it might be available
in U8 wasn't said. So, barring any official release of a fix or unofficial
availability of a workaround for S10, if you're worried about the
(admittedly unlikely) failure of a slog device on an inactive pool, have
good backups :).


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768
