On Jul 9, 2010, at 11:10 PM, Brandon High wrote:

> On Fri, Jul 9, 2010 at 5:18 PM, Brandon High <bh...@freaks.com> wrote:
> I think that DDT entries are a little bigger than what you're using. The size
> seems to range between 150 and 250 bytes depending on how it's calculated,
> call it 200b each. Your 128G dataset would require closer to 200M (+/- 25%)
> for the DDT if your data was completely unique. 1TB of unique data would
> require 600M - 1000M for the DDT.
>
> Using 376b per entry, it's 376M for 128G of unique data, or just under 3GB
> for 1TB of unique data.
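The figures above are straightforward arithmetic: one DDT entry per unique block, times an assumed bytes-per-entry. A minimal sketch of that estimate (the helper name and defaults are illustrative, not any ZFS interface; the 376-byte and ~200-byte entry sizes are the ones quoted in the thread):

```python
def ddt_bytes(unique_data_bytes, block_size, entry_bytes=376):
    """Rough DDT memory estimate: one entry per unique block."""
    entries = unique_data_bytes // block_size
    return entries * entry_bytes

# 128 GiB of unique data at the default 128 KiB recordsize:
# 1M entries * 376 bytes = 376 MiB, matching the figure quoted above.
print(ddt_bytes(128 * 2**30, 128 * 1024) / 2**20)  # 376.0 (MiB)

# A 1 TiB zvol with 8 KiB blocks at ~200 bytes per entry lands in the
# tens-of-gigabytes range discussed below.
print(ddt_bytes(2**40, 8 * 1024, 200) / 2**30)     # 25.0 (GiB)
```

The block size dominates: shrinking the recordsize from 128K to 8K multiplies the entry count, and thus the DDT footprint, by 16.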
4% seems to be a pretty good SWAG.

> A 1TB zvol with 8k blocks would require almost 24GB of memory to hold the
> DDT. Ouch.

... or more than 300GB for 512-byte records.

The performance issue is that DDT access tends to be random. This implies that if you don't have a lot of RAM and your pool has poor random-read I/O performance, then you will not be impressed with dedup performance. In other words, trying to dedup lots of data on a machine with little DRAM using big, slow pool HDDs will not set any benchmark records. By contrast, using SSDs for the pool can deliver good random-read performance. As the price per bit of HDDs continues to drop, the value of deduping pools built on HDDs also drops.
 -- richard

--
Richard Elling
rich...@nexenta.com  +1-760-896-4422
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss