While reading about the ZFS on-disk format, I wondered once again: why is it not possible to create a snapshot of existing data, not at the current TXG but at some older point in time?
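For concreteness, here is a rough sketch of what such an operation might look like. The syntax is purely hypothetical (a notation I invented for this post, with a made-up TXG number and dataset name); no such option exists in zfs today:

    # What exists today: a snapshot is always cut off at the current TXG,
    # i.e. it captures the dataset's present state:
    $ zfs snapshot tank/data@now

    # Proposed (hypothetical syntax): "inject" a snapshot whose cut-off
    # point is an older, already-committed TXG, so that it references
    # only blocks born at or before that transaction group:
    $ zfs snapshot -o txg=1234567 tank/data@injected-1234567

The receiving side would of course also need a way to accept such a snapshot into the middle of an existing snapshot stack; more on that below.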
From what I gathered, the definition of a snapshot requires a cut-off TXG number and the existence of some blocks in this dataset with smaller-or-equal TXG numbers. It seems almost coincidental that the current TXG is the one used and older TXGs aren't. Was the idea deemed inconvenient, impractical or useless, was it simply never considered, or are there fundamental or technological drawbacks to it?

Note: this idea is related to my proposal in the October thread "[zfs-discuss] (Incremental) ZFS SEND at sub-snapshot level" and could aid a "restartable zfs send" by creating smaller snapshots for incremental sending of existing large datasets.

Today I had a new twist on the idea, though: as I wrote in other posts, my raidz2 did not manage to protect some of my data. One of the damaged files belongs to a stack of snapshots that are continually replicated from another box, and the inconsistent on-disk block is referenced in an old snapshot (almost at the root of the stack). Resending and re-receiving the whole stack of snapshots is possible, but inconvenient and slow. Rsyncing just the difference (good data in place of the IO-erroring byte range) to repair the file would forfeit further incremental snapshot syncs.

So I thought: it would be nice if it were possible (perhaps not now, but in the future as an RFE) to resend and replace just that one snapshot in the middle, or even at the root, of the stack. Perhaps even better, with ZDB or some other tool I might determine which blocks have rotted and which TXG they belong to, and I'd "fence" that TXG on the source and destination systems with the proposed "injected snapshots". Older and newer snapshots around this TXG range would provide incremental changes to the data, as they normally do, and I'd only have to quickly replace a small intermediate snapshot. All this needs is a couple of not-yet-existing features...

PS: I think this idea might even have some "business case" foundation for active-passive clusters where zfs send updates the passive cluster node. Whenever a scrub on one of the systems finds an unrecoverable block in older data, that node could request just that block from the other head. Likewise for backups to removable media, etc. If we already have ZFS-based storage that resembles an out-of-sync mirror, why not use the available knowledge of known-good blocks to repair detected (small) errors in large volumes of the "same" data?

What do you think?

//Jim Klimov