On 03/ 2/10 11:48 AM, Freddie Cash wrote:
On Tue, Mar 2, 2010 at 7:15 AM, Kjetil Torgrim Homme <kjeti...@linpro.no> wrote:

    "valrh...@gmail.com" <valrh...@gmail.com> writes:

    > I have been using DVDs for small backups here and there for a decade
    > now, and have a huge pile of several hundred. They have a lot of
    > overlapping content, so I was thinking of feeding the entire stack
    > into some sort of DVD autoloader, which would just read each disk, and
    > write its contents to a ZFS filesystem with dedup enabled. [...] That
    > would allow me to consolidate a few hundred CDs and DVDs onto probably
    > a terabyte or so, which could then be kept conveniently on a hard
    > drive and archived to tape.

    it would be inconvenient to make a dedup copy on harddisk or tape, you
    could only do it as a ZFS filesystem or ZFS send stream.  it's better to
    use a generic tool like hardlink(1), and just delete files afterwards
    with

Why would it be inconvenient? This is pretty much exactly what ZFS + dedupe is perfect for.

Since dedupe is pool-wide, you could create individual filesystems for each DVD. Or use just 1 filesystem with sub-directories. Or just one filesystem with snapshots after each DVD is copied over top.

The data would be dedupe'd on write, so you would only have 1 copy of unique data.
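
As a rough illustration of the dedup-on-write idea, here is a toy Python model. It is only a sketch, not how ZFS is implemented: it assumes fixed-size blocks and SHA-256 checksums, and the class name DedupStore, the 4-byte block size, and the file names are all made up for the example.

```python
import hashlib

BLOCK_SIZE = 4  # tiny hypothetical block size so the example is easy to follow


class DedupStore:
    """Toy pool-wide block store: each unique block is kept once, keyed by checksum."""

    def __init__(self):
        self.blocks = {}  # checksum -> block data (stored once for the whole "pool")
        self.files = {}   # file name -> list of checksums (block pointers)

    def write(self, name, data):
        pointers = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            checksum = hashlib.sha256(block).hexdigest()
            # Store the block only if this checksum has not been seen before.
            self.blocks.setdefault(checksum, block)
            pointers.append(checksum)
        self.files[name] = pointers

    def read(self, name):
        # Rebuild the file by following its block pointers.
        return b"".join(self.blocks[c] for c in self.files[name])


store = DedupStore()
store.write("dvd1/photos", b"AAAABBBBCCCC")
store.write("dvd2/photos", b"AAAABBBBDDDD")  # shares two blocks with dvd1

logical = sum(len(store.read(n)) for n in store.files)   # bytes as seen by the user
physical = sum(len(b) for b in store.blocks.values())    # bytes actually stored
print(logical, physical)
```

Because the block table is shared by every file in the store, it does not matter whether the two DVDs land in separate filesystems or in sub-directories of one; the shared blocks are stored once either way.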

To save it to tape, just "zfs send" it, and save the stream file.

Stream dedup is largely independent of on-disk dedup. If the content is dedup'ed on disk, but you don't specify the -D to 'zfs send', the dedup'ed data will be re-expanded. Even if the content is NOT dedup'ed on disk, the -D option will cause the blocks to be dedup'ed in the stream.

One advantage to using them both is that the 'zfs send -D' processing doesn't need to recalculate the block checksums if they already exist on disk. This speeds up the send stream generation code by a lot.
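
The behaviour described above can be modelled with a small Python sketch. This is a toy, not the real send stream format: the function names make_stream and receive, the record layout, and the use of SHA-256 are assumptions for illustration only. The dedup flag stands in for passing -D, and the optional checksums argument stands in for checksums that already exist on disk.

```python
import hashlib


def make_stream(blocks, dedup, checksums=None):
    """Build a toy send stream from a list of blocks.

    dedup     -- models -D: repeated blocks become references to earlier records
    checksums -- optional precomputed per-block checksums (as if already on disk)
    Returns (records, hashes_computed).
    """
    records = []
    seen = {}            # checksum -> index of the record that carries the data
    hashes_computed = 0
    for i, block in enumerate(blocks):
        if not dedup:
            # Without dedup every block is written out in full.
            records.append(("data", block))
            continue
        if checksums is not None:
            checksum = checksums[i]          # reuse the existing checksum
        else:
            checksum = hashlib.sha256(block).hexdigest()
            hashes_computed += 1
        if checksum in seen:
            records.append(("ref", seen[checksum]))
        else:
            seen[checksum] = len(records)
            records.append(("data", block))
    return records, hashes_computed


def receive(records):
    """Expand a toy stream back into the original list of blocks."""
    return [value if kind == "data" else records[value][1]
            for kind, value in records]


def payload(records):
    """Count the data bytes carried by a toy stream."""
    return sum(len(value) for kind, value in records if kind == "data")


blocks = [b"AAAA", b"BBBB", b"AAAA", b"AAAA"]
on_disk = [hashlib.sha256(b).hexdigest() for b in blocks]

plain, _ = make_stream(blocks, dedup=False)
deduped, n_hashed = make_stream(blocks, dedup=True)
reused, n_reused = make_stream(blocks, dedup=True, checksums=on_disk)

print(payload(plain), payload(deduped), n_hashed, n_reused)
```

In this model the stream-level choice is made entirely inside make_stream, whatever the state of the source blocks, and supplying existing checksums removes the hashing work without changing the resulting stream.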

Also, in response to another comment about the send stream format not being recommended for archiving, that all depends on how you intend to use the send stream in the future. The format IS supported going forward, and future versions of zfs will continue to be capable of reading older send stream formats (the zfs(1M) man page has been modified to clarify this now).

Lori


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
