Freddie Cash <fjwc...@gmail.com> writes:

> Kjetil Torgrim Homme <kjeti...@linpro.no> wrote:
> > it would be inconvenient to make a dedup copy on harddisk or tape,
> > you could only do it as a ZFS filesystem or ZFS send stream.  it's
> > better to use a generic tool like hardlink(1), and just delete
> > files afterwards with
>
> Why would it be inconvenient?  This is pretty much exactly what ZFS +
> dedupe is perfect for.

the duplication is not visible, so it's still a wilderness of
duplicates when you navigate the files.
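for illustration, the hardlink(1) route might look like this -- the
/pool/dvds path is made up, and option spellings vary a bit between
hardlink implementations, so check your man page:

    # preview what would be linked, without changing anything
    hardlink -n -v /pool/dvds
    # then collapse identical files into hardlinks for real
    hardlink /pool/dvds

after that, ls -l shows the link count on each file, and a duplicate
only really goes away when its last name is deleted.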
> Since dedupe is pool-wide, you could create individual filesystems
> for each DVD.  Or use just 1 filesystem with sub-directories.  Or
> just one filesystem with snapshots after each DVD is copied over top.
>
> The data would be dedupe'd on write, so you would only have 1 copy of
> unique data.

for this application, I don't think the OP *wants* COW semantics if he
changes one file.  with dedup, editing one copy makes it silently
diverge from the others; he'll want the duplicates to be kept in sync
(in contrast to storage for VMs, for instance, where divergence is the
point).

with hardlinks, it is easier to identify duplicates and handle them
however you like.  if there is a reason for the duplicate access paths
to your data, you can keep them.  I would want to straighten the mess
out, though, rather than preserve it as faithfully as possible.

> To save it to tape, just "zfs send" it, and save the stream file.

the zfs stream format is not recommended for archiving: a single bit
error makes the whole stream impossible to receive, and there is no
way to extract individual files from it.

> ZFS dedupe would also work better than hardlinking files, as it works
> at the block layer, and will be able to dedupe partial files.

yes, but for the most part this will be negligible.  copies of growing
files, like log files, or perhaps your novel written as a stream of
consciousness, will benefit.  unrelated partially identical files are
rare.
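if you want a feel for how much dedup would actually buy you before
turning it on, zdb can simulate it on existing data; "tank" and
"tank/dvds" below are placeholder names:

    # build a simulated DDT and print a histogram plus an
    # estimated dedup ratio, without changing anything
    zdb -S tank

    # dedup only applies to data written after it is enabled
    zfs set dedup=on tank/dvds
    # the achieved pool-wide ratio shows up in the DEDUP column
    zpool list tank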