It occurred to me that there are scenarios where it would be useful to be
able to "zfs send -i A B" where B is a snapshot older than A. I am
trying to design an encrypted disk-based off-site backup solution on top
of ZFS, where budget is the primary constraint, and I wish zfs send/recv
would allow me to do that. Here is why.

I have a server with 12 hot-swap disk bays. An "onsite" pool has been
created on 6 disks, where snapshots of the data to be backed up are
periodically taken. Two other "offsite" pools have been created on two
other sets of 6 disks, let's give them the names offsite-blue and
offsite-red (for use on blue/red, or even/odd, weeks). At least one of
the offsite pools is always at the off-site location, while the other
one is either in transit or in the server. Every week a script
compresses and encrypts the last few snapshots (T-2, T-1, T-0) from
onsite into files on offsite-XXX. Here is an example:

  $ rm /offsite-blue/*
  $ zfs send        [EMAIL PROTECTED] | gzip | gpg -c >/offsite-blue/T-2.full.gz.gpg
  $ zfs send -i T-2 [EMAIL PROTECTED] | gzip | gpg -c >/offsite-blue/T-1.incr.gz.gpg
  $ zfs send -i T-1 [EMAIL PROTECTED] | gzip | gpg -c >/offsite-blue/T-0.incr.gz.gpg

Then offsite-blue is zpool export'ed and sent to the off-site location,
while offsite-red is retrieved from the off-site location and brought
back on-site, ready to be used for the next week. My proof-of-concept
tests show it works OK, but 2 details are annoying:

  o In order to restore the latest snapshot T-0, all the zfs streams,
    T-2, T-1 and T-0, have to be decrypted, then zfs receive'd. It is
    slow and inconvenient.
  o My example only backs up the last 3 snapshots, but ideally I would
    like to fit as many as possible in the offsite pool. However, because
    of the unpredictable compression efficiency, I can't tell which
    snapshot I should start from when creating the first full stream.
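To make the first point concrete, here is a rough sketch of what the
restore looks like today: every stream has to be decrypted, decompressed
and received in order, full stream first. The target dataset name and the
DRY_RUN switch are my own illustrations, not part of the actual script.

```shell
#!/bin/sh
# Sketch only. Assumptions: the DRY_RUN switch (defaults to 1, so nothing
# is executed) and whatever target dataset name is passed in.

# Print the pipeline that restores one stream file into a dataset.
restore_cmd() {
    # $1 = encrypted stream file, $2 = target dataset
    printf 'gpg -d %s | gunzip | zfs receive %s\n' "$1" "$2"
}

# Restore a chain in the order given: the full stream first,
# then each incremental from oldest to newest.
restore_chain() {
    target=$1; shift
    for stream in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            restore_cmd "$stream" "$target"
        else
            gpg -d "$stream" | gunzip | zfs receive "$target"
        fi
    done
}

# Example (hypothetical target dataset):
#   restore_chain onsite/restore /offsite-blue/T-2.full.gz.gpg \
#       /offsite-blue/T-1.incr.gz.gpg /offsite-blue/T-0.incr.gz.gpg
```

With a chain of N streams, getting to T-0 always costs N decrypt and
receive passes, which is the inconvenience described above.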

These 2 problems would be non-existent if one could "zfs send -i A B"
with B older than A:

  $ zfs send        [EMAIL PROTECTED] | gzip | gpg -c >/offsite-blue/T-0.full.gz.gpg
  $ zfs send -i T-0 [EMAIL PROTECTED] | gzip | gpg -c >/offsite-blue/T-1.incr.gz.gpg
  $ zfs send -i T-1 [EMAIL PROTECTED] | gzip | gpg -c >/offsite-blue/T-2.incr.gz.gpg
  $ ... # continue forever, kill zfs(1m) when offsite-blue is 90% full
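The "90% full" stopping condition could be checked by a script instead of
by hand. A minimal sketch, assuming the capacity is read with
"zpool list -H -o capacity", which prints a value such as "45%"; the
helper name over_threshold is hypothetical.

```shell
#!/bin/sh
# Sketch only. over_threshold is an illustrative helper, not an existing tool.

# Succeed when a capacity string such as "91%" is at or above the
# threshold given as a plain number (for example 90).
over_threshold() {
    pct=${1%\%}              # strip the trailing percent sign
    [ "$pct" -ge "$2" ]
}

# Example use between two streams:
#   if over_threshold "$(zpool list -H -o capacity offsite-blue)" 90; then
#       echo "offsite-blue is 90% full, stopping" >&2
#       exit 0
#   fi
```

Checking between streams means the last stream written can still push the
pool past the threshold, so killing zfs mid-stream as described above
would remain the fallback.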

I have looked at the code and the restriction "A must be earlier than B"
is enforced in dmu_send.c:dmu_sendbackup() [1]. It looks like the code
could be reworked to remove it.

Of course, when zfs-crypto ships, it will simplify a lot of things.
I could just always send incremental streams and receive them directly
on the encrypted pool, and manage snapshot rotation directly by
zfs destroy'ing the old ones, etc.
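The rotation part could look roughly like the sketch below. It only
picks which snapshots to destroy; the helper name, the dataset name
offsite-blue/backup and the keep count are assumptions for illustration.

```shell
#!/bin/sh
# Sketch only. snapshots_to_destroy is an illustrative helper.

# Print every name except the newest $1 ones.
# The remaining arguments must be snapshot names, oldest first.
snapshots_to_destroy() {
    keep=$1; shift
    while [ "$#" -gt "$keep" ]; do
        printf '%s\n' "$1"
        shift
    done
}

# Example use (hypothetical dataset, keeping the 3 newest snapshots):
#   snaps=$(zfs list -H -r -t snapshot -o name -s creation offsite-blue/backup)
#   for s in $(snapshots_to_destroy 3 $snaps); do
#       zfs destroy "$s"
#   done
```

Sorting by creation time with "-s creation" is what gives the oldest-first
order the helper relies on.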

[1] http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/uts/common/fs/zfs/dmu_send.c#232

-marc
