On 31.07.09 22:04, Kurt Olsen wrote:
On Jul 24, 2009, at 22:17, Bob Friesenhahn wrote:
....
Most of the issues that I've read on this list would have been "solved" if there was a mechanism where the user/sysadmin could tell ZFS to simply go back until it found a TXG that worked. The trade-off is that any transactions (and their data) after the working one would be lost. But at least you're not left with an un-importable pool.
I'm curious as to why people think rolling back txgs doesn't come with additional costs beyond losing recent transactions. What are the odds that the data blocks that were replaced by the discarded transactions haven't been overwritten?
The odds depend on many factors: activity in the pool, free space, block-selection policy, metaslab cursor positions, etc. I have seen examples of successful recovery to a point in time around 9 hours before the last synced txg. Sometimes it is enough to roll one txg back; sometimes it requires going back and trying a few older ones.
Without a snapshot to hold the references, aren't those blocks considered free and available for reuse?
As soon as a transaction group is synced, the blocks freed during that transaction group are released back to the pool and can potentially be overwritten during the next txg.
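The block-reuse hazard above can be made concrete with a toy copy-on-write model. Everything below (`ToyPool`, `sync_txg`) is a made-up sketch, not ZFS source code: it only illustrates why an older uberblock can end up pointing at a block that a later txg has already reallocated.

```python
# Toy copy-on-write allocator (illustrative sketch only, NOT real ZFS code;
# all names here are invented). Each txg writes new copies of the blocks it
# modifies and frees the old copies; once the txg syncs, those freed blocks
# go back on the free list and the very next txg may reallocate them.

class ToyPool:
    def __init__(self, nblocks):
        self.free = list(range(nblocks))  # allocatable block numbers
        self.data = {}                    # block number -> contents
        self.uberblocks = {}              # txg -> set of live block numbers

    def sync_txg(self, txg, live_before, rewrites):
        """Rewrite each block in 'rewrites' (None = brand-new block)."""
        live = set(live_before)
        for old in rewrites:
            new = self.free.pop(0)        # allocate a fresh block
            self.data[new] = f"txg{txg}"
            live.add(new)
            if old is not None:
                live.discard(old)
                self.free.append(old)     # old copy freed -> reusable
        self.uberblocks[txg] = live
        return live

pool = ToyPool(4)
live1 = pool.sync_txg(1, [], [None, None])  # txg 1 allocates blocks 0 and 1
live2 = pool.sync_txg(2, live1, [0])        # txg 2 rewrites block 0
live3 = pool.sync_txg(3, live2, [1, 2])     # txg 3 rewrites blocks 1 and 2

# txg 1's uberblock still references block 0, but txg 3 has already
# reallocated it, so rolling back to txg 1 would read txg 3's data there.
print(pool.uberblocks[1])   # {0, 1}
print(pool.data[0])         # 'txg3' -- stale pointer after rollback
```

The further back the rollback target, the more intervening txgs have had a chance to recycle its freed blocks, which is why the odds vary with pool activity and free space as described above.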
Don't get me wrong, I do think that rolling back to previous uberblocks should be an option versus total pool loss, but it doesn't seem like one can reliably say that their data is in some known good state.
In fact, thanks to everything being checksummed, one can say that the pool is in good shape as reliably as the checksum in use allows.
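A sketch of why the rolled-back state is verifiable: in ZFS, each block pointer carries the checksum of the block it references, so a full traversal from the chosen uberblock either validates everything or pinpoints where old data has been overwritten. The code below is a hypothetical Merkle-style model (SHA-256 from `hashlib` stands in for the pool's checksum algorithm), not the actual ZFS on-disk format.

```python
# Toy sketch (NOT real ZFS code) of checksum-verified traversal: a parent
# pointer stores the expected checksum of its child, so verification walks
# the whole tree from the root and flags any block whose contents changed.
import hashlib

def cksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify(storage, ptr):
    """ptr = (block_id, expected_checksum, child_ptrs)."""
    block_id, expected, children = ptr
    if cksum(storage[block_id]) != expected:
        return False                    # block overwritten since this txg
    return all(verify(storage, c) for c in children)

storage = {1: b"root", 2: b"leaf-a", 3: b"leaf-b"}
tree = (1, cksum(b"root"), [(2, cksum(b"leaf-a"), []),
                            (3, cksum(b"leaf-b"), [])])
print(verify(storage, tree))            # True: intact tree verifies

storage[3] = b"reused by a later txg"   # a freed block got overwritten
print(verify(storage, tree))            # False: traversal catches it
```

So after a rollback, one can traverse from the selected uberblock and trust the result exactly as far as the checksum's collision resistance allows, which is the claim above.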
victor
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss