I'm not sure about the *docs*, but here are my rough estimates:

Assume 1TB of actual used storage.  Assume a 64K block/slab size.  (Not
sure how realistic that is -- it depends entirely on your data set.)
Assume 300 bytes per DDT entry.

So we have (1024^4 / 65536) * 300 = 5,033,164,800 bytes, or about 5GB
of RAM for one TB of used disk space.
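
If you want to redo that arithmetic for your own pool, here's a quick
bash sketch.  The 64K block size and 300 bytes per entry are the same
assumptions as above -- plug in whatever matches your data set:

    # Back-of-envelope DDT RAM estimate.  All inputs are assumptions;
    # adjust them to match your pool.
    used=$((1024 ** 4))   # bytes actually used (1TB here)
    bs=65536              # assumed average block size (64K)
    entry=300             # assumed bytes per DDT entry
    echo "DDT RAM needed: $((used / bs * entry)) bytes"   # ~5033164800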

Dedup is *hungry* for RAM.  Most likely, 8GB is not enough for your
configuration!  First guess: double the RAM and then you might have
better luck.

The other takeaway here: dedup is the wrong technology for the typical
small home server (e.g. systems that max out at 4 or even 8GB of RAM).

Look into compression and snapshot clones as better alternatives to
reduce your disk space needs without incurring the huge RAM penalties
associated with dedup.
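
For example, something like this -- the dataset names below are made
up, so substitute your own:

    # Enable compression on a dataset (lzjb by default; gzip-N is also
    # available), then see what it actually buys you:
    zfs set compression=on tank/data
    zfs get compressratio tank/data

    # Keep a snapshot clone instead of a second full copy of the data:
    zfs snapshot tank/data@golden
    zfs clone tank/data@golden tank/data-copy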

Dedup is *great* for a certain type of data set on configurations that
are extremely RAM-heavy.  For everyone else, it's almost universally
the wrong solution.  Ultimately, disk is usually cheaper than RAM --
think hard before you enable dedup: are you making the right trade-off?
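
One rough way to frame that trade-off, with every number below a
placeholder you should replace with what you actually pay:

    # Compare the RAM the DDT needs against the disk dedup might save,
    # per TB stored.  Prices are placeholders, in cents per GB.
    ram_price=1000   # cents/GB of RAM (placeholder)
    disk_price=10    # cents/GB of disk (placeholder)
    ddt_gb=5         # DDT RAM per TB stored, from the estimate above
    saved_gb=512     # disk saved per TB if dedup achieves 2x
    echo "RAM cost:   $((ddt_gb * ram_price)) cents per TB stored"
    echo "Disk saved: $((saved_gb * disk_price)) cents per TB stored"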

        - Garrett

On Sun, 2011-01-30 at 22:53 +0100, Roy Sigurd Karlsbakk wrote:
> Hi all
> 
> As I've said here on the list a few times earlier, most recently in the 
> thread 'ZFS not usable (was ZFS Dedup question)', I've been doing some 
> rather thorough testing of zfs dedup, and as you can see from those posts, 
> it wasn't very satisfactory. The docs claim 1-2GB of memory usage per 
> terabyte stored, in ARC or L2ARC, but as you can read from the posts, I 
> don't find this very likely.
> 
> So, is there anyone in here using dedup for large storage (2TB? 10TB? 
> more?) who can document sustained high performance?
> 
> The reason I ask is that if this is the case, something is badly wrong 
> with my test setup.
> 
> The test box is a Supermicro thing with a Core2Duo CPU, 8 gigs of RAM, 4 
> gigs of mirrored SLOG and some 150 gigs of L2ARC on 80GB X25-M drives. The 
> data drives are 7 2TB drives in RAIDz2. We're getting down to 10-20MB/s on 
> Bacula backups to this system -- that is, streaming writes, which RAIDz2 
> should handle well. Since the writes are local (bacula-sd is running on 
> the box), async writes will be the main thing. Initial results show pretty 
> good I/O performance, but after about 2TB used, the I/O speed is down to 
> the numbers I mentioned.
> 
> PS: I know those drives aren't optimal for this, but the box is a year old or 
> so. Still, they should help out a bit.
> 
> Vennlige hilsener / Best regards
> 
> roy
> --
> Roy Sigurd Karlsbakk
> (+47) 97542685
> r...@karlsbakk.net
> http://blogg.karlsbakk.net/
> --
> In all pedagogy it is essential that the curriculum be presented 
> intelligibly. It is an elementary imperative for all pedagogues to avoid 
> excessive use of idioms of foreign origin. In most cases, adequate and 
> relevant synonyms exist in Norwegian.


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
