On Thu, 17 Dec 2009, Kjetil Torgrim Homme wrote:
Compression requires CPU, actually quite a lot of it. Even with the lean and mean lzjb, you will not get much more than 150 MB/s per core or so. So, if you are copying a 10 GB image file, it will take a minute or two just to compress the data so that the hash can be computed and the duplicate blocks identified. If the dedup hash were based on uncompressed data, the copy would be limited only by hashing efficiency (and the dedup tree lookup).
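To put rough numbers on that, here is a quick back-of-the-envelope sketch using the assumed ~150 MB/s per core figure above (an estimate, not a measurement):

    # Back-of-the-envelope estimate using the assumed figures from the quote.
    image_size_mb = 10 * 1024      # 10 GB image file
    lzjb_rate_mb_s = 150           # assumed per-core lzjb compression throughput
    seconds = image_size_mb / lzjb_rate_mb_s
    print(f"~{seconds:.0f} s (~{seconds / 60:.1f} min) of compression on one core")
    # prints: ~68 s (~1.1 min), i.e. "a minute or two" on a single core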
It is useful to keep in mind that deduplication can save a lot of disk space, but it is usually only effective in certain circumstances, such as when replicating a collection of files. The majority of write I/O will never benefit from deduplication. Based on this, speculatively assuming that the data will not be deduplicated does not increase cost most of the time. If the data does end up being deduplicated, then that is a blessing.
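For illustration only, a minimal conceptual sketch of that compress-first, then-hash, then-dedup-lookup ordering. The names and data structures here are invented for the example and are not ZFS's actual interfaces; zlib and SHA-256 stand in for lzjb and the dedup checksum:

    import hashlib
    import zlib
    from dataclasses import dataclass

    @dataclass
    class DdtEntry:
        data: bytes        # stand-in for an on-disk block pointer
        refcount: int

    ddt = {}               # in-memory stand-in for the dedup table (DDT)

    def write_block(block: bytes) -> DdtEntry:
        compressed = zlib.compress(block)           # CPU cost paid for every block written
        key = hashlib.sha256(compressed).digest()   # dedup key computed over the compressed bytes
        entry = ddt.get(key)
        if entry is not None:                       # duplicate: bump the refcount, skip the write
            entry.refcount += 1
            return entry
        entry = DdtEntry(data=compressed, refcount=1)   # unique block: pay the normal write cost
        ddt[key] = entry
        return entry

    # Two identical 128 KiB blocks: the second copy deduplicates against the first.
    block = b"x" * 131072
    write_block(block)
    print(write_block(block).refcount)   # 2

Note that every block, duplicate or not, pays the compression and hashing cost before the dedup table is ever consulted, which is the cost being discussed above.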
Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/