On 05.06.10 00:10, Ray Van Dolson wrote:
On Fri, Jun 04, 2010 at 01:03:32PM -0700, Brandon High wrote:
On Fri, Jun 4, 2010 at 12:37 PM, Ray Van Dolson <rvandol...@esri.com> wrote:
Makes sense.  So, as someone else suggested, decreasing my block size
may improve the deduplication ratio.
It might. It might make your performance tank, too.

Decreasing the block size increases the size of the dedup table (DDT).
Every entry in the DDT uses somewhere around 250-270 bytes. If the DDT
gets too large to fit in memory, it will have to be read from disk,
which will destroy any sort of write performance (although an L2ARC on
SSD can help).

If you move to 64k blocks, you'll double the DDT size and may not
actually increase your ratio. Moving to 8k blocks will increase your
DDT by a factor of 16, and still may not help.
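
As a rough back-of-the-envelope, assuming ~270 bytes per entry and
1 TB of unique data (scale to your pool):

  128k recordsize:   ~8.4M blocks x 270 B  =  ~2.3 GB of DDT
   64k recordsize:  ~16.8M blocks x 270 B  =  ~4.5 GB of DDT
    8k recordsize: ~134.2M blocks x 270 B  =   ~36 GB of DDT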

Changing the recordsize will not affect files that are already in the
dataset. You'll have to recopy them so they are rewritten with the
smaller block size.
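
For example, with placeholder dataset names, something like:

  # zfs set recordsize=8k tank/data
  # zfs get recordsize tank/data

and then recopy the data (cp, rsync, or zfs send/recv into a fresh
dataset) so the files get rewritten at the new block size.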

-B

Gotcha.  Just trying to make sure I understand how all this works, and
if I _would_ in fact see an improvement in dedupe ratio by tweaking the
recordsize for our dataset.


You can use zdb -S to estimate how effective deduplication would be without actually enabling it on your pool.
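
For example, with a placeholder pool name:

  # zdb -S tank

It walks the pool and prints a simulated DDT histogram plus an estimated
dedup ratio at the end; on a large pool it can take quite a while to run.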

regards
victor
