On 31.07.09 22:04, Kurt Olsen wrote:
On Jul 24, 2009, at 22:17, Bob Friesenhahn wrote:
....
Most of the issues that I've read on this list would have been "solved" if there were a mechanism where the user/sysadmin could tell ZFS to simply go back until it found a TXG that worked.

The trade-off is that any transactions (and their data) after the working one would be lost. But at least you're not left with an unimportable pool.

I'm curious as to why people think rolling back txgs doesn't come with additional costs beyond losing recent transactions. What are the odds that the data blocks that were replaced by the discarded transactions haven't been overwritten?

Odds depend on lots of factors: activity in the pool, free space, block selection policy, metaslab cursor positions, etc. I have seen examples of successful recovery to a point in time around 9 hours before the last synced txg. Sometimes it is enough to roll one txg back; sometimes it requires going back and trying a few older ones.
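For readers wanting to try this themselves: on builds that support pool recovery, the procedure looks roughly like the sketch below. The device name (`/dev/dsk/c0t0d0s0`), pool name (`tank`), and txg number are placeholders, and the `-F`/`-X`/`-T` import options may not exist in older releases, so treat this as an assumed outline rather than a guaranteed recipe:

```shell
# List the uberblocks (with their txg numbers) kept in the vdev labels,
# to see how far back a rollback could plausibly go.
zdb -ul /dev/dsk/c0t0d0s0

# Recovery-mode import: discard the last few transactions if the most
# recent txg is damaged. Importing read-only first is the cautious path.
zpool import -F -o readonly=on tank

# Extreme variant: roll back to a specific older txg (destructive;
# the txg number here is a placeholder taken from the zdb output above).
zpool import -T 1234567 tank
```

Because freed blocks may already have been reused, a read-only import followed by a scrub or a data copy is the safest way to judge whether a given rollback point is actually usable.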

Without a snapshot to hold the references, aren't those blocks considered free and available for reuse?

As soon as a transaction group is synced, blocks freed during that transaction group are released back to the pool and can potentially be overwritten during the next txg.

Don't get me wrong, I do think that rolling back to previous uberblocks should be an option versus total pool loss, but it doesn't seem like one can reliably say that their data is in some known good state.

In fact, thanks to everything being checksummed, one can say that the pool is in good shape as reliably as the checksum in use allows.
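To illustrate why a rolled-back pool can be validated at all: every ZFS block pointer stores the checksum of the block it points to, so a walk from the uberblock verifies everything beneath it. The sketch below is a deliberately simplified model (real ZFS uses fletcher or SHA-256 over on-disk structures; `Block`, `checksum`, and `verify` are hypothetical names for illustration only):

```python
import hashlib

def checksum(data: bytes) -> str:
    # Stand-in for the on-disk checksum (SHA-256 here for simplicity).
    return hashlib.sha256(data).hexdigest()

class Block:
    """A block whose pointers to children carry the children's checksums."""
    def __init__(self, data: bytes, children=None):
        self.data = data
        self.children = children or []
        # Parent-side pointers record each child's checksum at write time.
        self.child_sums = [checksum(c.data) for c in self.children]

def verify(block: Block) -> bool:
    """Walk the tree, checking every child against its stored checksum."""
    for child, expected in zip(block.children, block.child_sums):
        if checksum(child.data) != expected:
            return False
        if not verify(child):
            return False
    return True

leaf = Block(b"file data")
root = Block(b"metadata", children=[leaf])  # plays the uberblock's role
print(verify(root))        # True: the tree is self-consistent
leaf.data = b"corrupted"   # simulate on-disk corruption after the fact
print(verify(root))        # False: the mismatch is detected from the root
```

This is why Victor can claim the rolled-back state is "good" up to the strength of the checksum: any block that was overwritten after the chosen txg will fail verification somewhere along the walk.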

victor

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
