On 8/29/06, James Dickens <[EMAIL PROTECTED]> wrote:
ZFS + rsync, backup on steroids.


If you combine this with a de-duplication algorithm you could get
really space-efficient backups.

Suppose you have 100 (or 1000, or 10000) machines to back up that are
the same 3 GB OS image + mixed bag of apps + various prod/non-prod
copies of databases + per-machine customization.  Wouldn't it be nice
if the backup server could figure out that each machine is mostly the
same and store only one copy?  One way would be a mechanism that
stores a per-block checksum in a database, then looks for matches by
checksum (aka hash) each time a block is written.  Hash collisions
should be verified with a full block compare.
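Here's a rough Python sketch of that checksum-index scheme; the class
name, the choice of SHA-256, and the in-memory dict are my own
illustrative assumptions, not anything ZFS does today:

```python
import hashlib

class DedupStore:
    """Toy content-addressed block store: index blocks by hash,
    verify apparent duplicates with a full block compare."""

    def __init__(self):
        self.index = {}     # digest -> list of distinct blocks with that digest
        self.logical = 0    # bytes clients asked us to store
        self.physical = 0   # bytes actually kept

    def write(self, block):
        """Store a block and return its digest; duplicates cost no space."""
        digest = hashlib.sha256(block).digest()
        self.logical += len(block)
        candidates = self.index.setdefault(digest, [])
        # A hash match is only a candidate: confirm with a full block
        # compare so a (vanishingly unlikely) collision can't corrupt
        # a restore.
        for stored in candidates:
            if stored == block:
                return digest       # true duplicate: nothing new stored
        candidates.append(block)
        self.physical += len(block)
        return digest
```

Writing the same 3 GB OS image from 100 machines would leave
`physical` at roughly one image's worth while `logical` grows
100-fold.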

Then you could create your restore procedure as a CGI or similar web
magic that generates a flar based upon the URL+args provided.  That
URL can then be used in a jumpstart profile as "archive_location
http://backupserver.mycompany.com/flar/...".  A finish script would be
responsible for using rsync or similar to copy the sysidcfg-related
files that jumpstart/flar refuses to preserve.
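A profile for that restore path might look something like the
following; the hostname, the query string, and the disk layout are all
placeholders I made up, not anything from a real setup:

```
# Hypothetical jumpstart profile -- host, URL args, and slice layout
# below are illustrative only.
install_type        flash_install
archive_location    http://backupserver.mycompany.com/flar/restore?host=web01
partitioning        explicit
filesys             c0t0d0s0 free /
```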

FWIW, de-duplication seems to be a hot topic in VTLs (Virtual Tape
Libraries).  This would be an awesome feature to have in ZFS, even if
the de-duplication happens as a later pass similar to zfs scrub.

Mike

--
Mike Gerdts
http://mgerdts.blogspot.com/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
