On Jul 9, 2010, at 11:10 PM, Brandon High wrote:

> On Fri, Jul 9, 2010 at 5:18 PM, Brandon High <bh...@freaks.com> wrote:
> I think that DDT entries are a little bigger than what you're using. The size
> seems to range between 150 and 250 bytes depending on how it's calculated,
> call it 200b each. Your 128G dataset would require closer to 200M (+/- 25%)
> for the DDT if your data was completely unique. 1TB of unique data would
> require 600M - 1000M for the DDT.
>
> Using 376b per entry, it's 376M for 128G of unique data, or just under 3GB
> for 1TB of unique data.
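The figures above are straightforward arithmetic: one DDT entry per unique block, times an assumed bytes-per-entry. A minimal sketch of that estimate (the helper name and defaults are illustrative, not any ZFS interface; the 376-byte and ~200-byte entry sizes are the ones quoted in the thread):

```python
def ddt_bytes(unique_data_bytes, block_size, entry_bytes=376):
    """Rough DDT memory estimate: one entry per unique block."""
    entries = unique_data_bytes // block_size
    return entries * entry_bytes

# 128 GiB of unique data at the default 128 KiB recordsize:
# 1M entries * 376 bytes = 376 MiB, matching the figure quoted above.
print(ddt_bytes(128 * 2**30, 128 * 1024) / 2**20)  # 376.0 (MiB)

# A 1 TiB zvol with 8 KiB blocks at ~200 bytes per entry lands in the
# tens-of-gigabytes range discussed below.
print(ddt_bytes(2**40, 8 * 1024, 200) / 2**30)     # 25.0 (GiB)
```

The block size dominates: shrinking the recordsize from 128K to 8K multiplies the entry count, and thus the DDT footprint, by 16.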
4% seems to be a pretty good SWAG.

> A 1TB zvol with 8k blocks would require almost 24GB of memory to hold the
> DDT. Ouch.

... or more than 300GB for 512-byte records.

The performance issue is that DDT access tends to be random. This implies that if you don't have a lot of RAM and your pool has poor random-read I/O performance, then you will not be impressed with dedup performance. In other words, trying to dedup lots of data on a machine with little DRAM using big, slow pool HDDs will not set any benchmark records. By contrast, using SSDs for the pool can deliver good random-read performance. As the price per bit of HDDs continues to drop, the value of deduping pools built on HDDs also drops.
 -- richard

--
Richard Elling
rich...@nexenta.com  +1-760-896-4422
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss