Another thought is this: _unless_ the CPU is the bottleneck on
a particular system, compression (_when_ it actually helps) can
speed up overall operation by reducing the amount of I/O needed.
But storing already-compressed files in a filesystem with compression
enabled is likely to be wasted effort, with little or no gain to show
for it.
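
Just to illustrate that last point, here is a quick sketch in Python
(nothing ZFS-specific, and the sample data is made up for the example)
of why recompressing already-compressed data gains essentially nothing:

    import os
    import zlib

    # Repetitive, text-like data compresses well...
    text = b"the quick brown fox jumps over the lazy dog\n" * 1000
    once = zlib.compress(text)

    # ...but compressing the compressor's own output (or data that is
    # already compressed, like jpeg/mp3/zip payloads) gains nothing,
    # and may even grow slightly.
    twice = zlib.compress(once)
    random_like = zlib.compress(os.urandom(len(text)))

    print("plain text        :", len(text), "->", len(once))
    print("already compressed:", len(once), "->", len(twice))
    print("random-like data  :", len(text), "->", len(random_like))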

Even deduplication requires some extra effort.  The documentation
implies a particular checksum algorithm _plus_ verification: if the
checksum or digest matches, make sure by doing a byte-for-byte compare
of the blocks, since nothing shorter than the data itself can
_guarantee_ that they're the same, just as no lossless compression can
possibly work for all possible bitstreams.
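
In other words, the write path looks roughly like this (a sketch in
Python, not the actual ZFS code; the table of known blocks is just an
in-memory dict here):

    import hashlib

    known_blocks = {}   # digest -> block data already stored

    def write_block(data, verify=True):
        """Return True if the block was deduplicated, False if stored anew."""
        digest = hashlib.sha256(data).digest()
        existing = known_blocks.get(digest)
        if existing is not None:
            # A matching digest alone is not proof; only a byte-for-byte
            # compare can _guarantee_ the blocks really are identical.
            if not verify or existing == data:
                return True
            # (A real implementation would handle the collision properly;
            # this sketch just falls through and stores the block.)
        known_blocks[digest] = data
        return False

    print(write_block(b"A" * 4096))   # False: first copy gets stored
    print(write_block(b"A" * 4096))   # True: deduplicated against it

So every write pays for the hashing and the table lookup, and with
verification each match also costs a read-and-compare of the existing
block.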

So enabling either of these features where the hit rate is likely to be
low is probably not worth the effort.

There are stats that show the savings a filesystem is getting from
compression or deduplication.  What I think would be interesting is
some advice on how much (percentage) savings one should be seeing in
order to come out ahead not just on storage, but on overall system
performance.  Of course, no such guidance would exactly fit any
particular workload, but I think one could come up with some
approximate numbers, or at least a range, below which those features
probably represent a waste of effort unless space is at an absolute
premium.
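
To make the "come out ahead" question a little more concrete, the sort
of back-of-the-envelope model I have in mind looks like this (every
number below is invented for illustration, and it assumes the worst
case where compression and disk I/O don't overlap at all):

    # Serial (no-overlap) model: each logical megabyte written costs some
    # CPU time to compress plus disk time for whatever is left after savings.
    DISK_MB_S = 100.0       # assumed raw disk write throughput
    COMPRESS_MB_S = 400.0   # assumed compressor throughput (CPU-bound)

    def time_per_mb(savings_pct):
        """Seconds to write one logical MB with compression enabled."""
        remaining = 1.0 - savings_pct / 100.0   # fraction actually hitting disk
        return 1.0 / COMPRESS_MB_S + remaining / DISK_MB_S

    baseline = 1.0 / DISK_MB_S                  # no compression at all

    # Break-even: the savings have to at least cover the CPU cost, i.e.
    # savings fraction > DISK_MB_S / COMPRESS_MB_S (25% with these numbers).
    for pct in (0, 10, 25, 40, 60):
        verdict = "ahead" if time_per_mb(pct) < baseline else "behind/even"
        print("%3d%% savings -> %s" % (pct, verdict))

With different disk and CPU numbers the threshold moves around, which
is exactly why a published range rather than a single figure would be
the useful thing to have.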