>>>>> "et" == Erik Trimble <[EMAIL PROTECTED]> writes:
et> Dedup Advantages:
et> (1) save space
et> (2) coalesce data which is frequently used by many nodes in a large
et>     cluster into a small nugget of common data which can fit into RAM
et>     or fast L2 disk
et> (3) back up non-ZFS filesystems that don't have snapshots and clones
et> (4) make offsite replication easier on the WAN

but, yeah, aside from imagining ahead to possible disastrous problems with
the final implementation, the imagined use cases should probably be
compared carefully against existing large installations.

Firstly, dedup may be more tempting as a bulleted marketing feature, or a
bloggable boasting point, than it is valuable to real users.

Secondly, the comparison may drive the implementation.  For example, should
dedup happen at write time and apply only to data written after it's turned
on, like recordsize or compression?  That keeps the user interface simple
and avoids scrubs making pools uselessly slow.  Or should it be scrub-like,
so that already-written filesystems can be thrown into the dedup bag and
slowly squeezed, or so that dedup can run slowly during the business day
over data written quickly at night (fast outside-business-hours backup)?
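For anyone unfamiliar with the mechanics being debated: the write-time
variant amounts to content-addressing each block as it lands, so identical
blocks are stored once and everything else holds a reference.  Here's a toy
sketch of that idea (the `DedupStore` class and its names are hypothetical,
not anything from the ZFS code):

```python
import hashlib

class DedupStore:
    """Toy write-time dedup: blocks are keyed by their SHA-256 digest,
    so a block identical to one already stored costs no new space.
    (Hypothetical sketch; real dedup must also handle hash collisions,
    on-disk layout, and freeing blocks when refcounts hit zero.)"""

    def __init__(self):
        self.blocks = {}    # digest -> block data (stored once)
        self.refcount = {}  # digest -> number of logical references

    def write(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        if digest not in self.blocks:
            self.blocks[digest] = data  # first copy pays the full cost
        self.refcount[digest] = self.refcount.get(digest, 0) + 1
        return digest                   # caller keeps the digest as a pointer

    def read(self, digest: str) -> bytes:
        return self.blocks[digest]

store = DedupStore()
a = store.write(b"common config block")
b = store.write(b"common config block")  # duplicate: same digest, no new storage
assert a == b and len(store.blocks) == 1
```

The scrub-like variant would instead walk already-written blocks in the
background, building the same digest table after the fact and rewriting
duplicates to share storage, which is why it can run off-hours over data
that was written quickly.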
_______________________________________________ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss