On Thu, 17 Dec 2009, Kjetil Torgrim Homme wrote:

> compression requires CPU, actually quite a lot of it.  even with the
> lean and mean lzjb, you will get not much more than 150 MB/s per core or
> something like that.  so, if you're copying a 10 GB image file, it will
> take a minute or two, just to compress the data so that the hash can be
> computed so that the duplicate block can be identified.  if the dedup
> hash was based on uncompressed data, the copy would be limited by
> hashing efficiency (and dedup tree lookup).

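For what it's worth, a quick back-of-the-envelope check of those numbers (a rough Python sketch; the 150 MB/s lzjb figure comes from the quote above, while the ~400 MB/s single-core SHA-256 rate is just my assumption):

# Rough estimate only: time to compress vs. hash a 10 GB image
# at assumed single-core throughputs.
GB = 1024 ** 3
MB = 1024 ** 2

image_size  = 10 * GB
lzjb_rate   = 150 * MB   # quoted lzjb throughput per core
sha256_rate = 400 * MB   # assumed SHA-256 throughput per core

print("compress before dedup hash: ~%.0f s" % (image_size / lzjb_rate))    # ~68 s
print("hash uncompressed data:     ~%.0f s" % (image_size / sha256_rate))  # ~26 s

So "a minute or two" is about right if the data has to be compressed before the dedup hash can be computed.
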
It is useful to keep in mind that deduplication can save a lot of disk space, but it is usually effective only in certain circumstances, such as when replicating a collection of files. The majority of write I/O will never benefit from deduplication. Given that, speculatively assuming that the data will not be deduplicated adds no cost most of the time. If the data does end up being deduplicated, then that is a blessing.
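
To illustrate the point (a toy sketch, not how the ZFS dedup code actually works; the 128 KB recordsize and SHA-256 checksum are just assumptions): copying an existing file produces blocks whose checksums are already in the dedup table, while freshly written unique data matches nothing.

import hashlib, os

RECORDSIZE = 128 * 1024

def block_hashes(data):
    # checksum each record-sized block, as a stand-in for the dedup checksum
    return [hashlib.sha256(data[i:i + RECORDSIZE]).digest()
            for i in range(0, len(data), RECORDSIZE)]

existing = os.urandom(8 * RECORDSIZE)        # stands in for a file already on disk
dedup_table = set(block_hashes(existing))    # checksums of blocks already written

copy_hits = sum(h in dedup_table for h in block_hashes(existing))
new_hits  = sum(h in dedup_table for h in block_hashes(os.urandom(8 * RECORDSIZE)))

print("copy of existing file: %d/8 blocks deduplicated" % copy_hits)   # 8/8
print("freshly written data:  %d/8 blocks deduplicated" % new_hits)    # 0/8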

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
