Howdy all, I too dabbled with dedup and found the performance poor with only 4GB of RAM. I've since disabled dedup and performance is better, but "zpool list" still shows a 1.15x dedup ratio. Is this still a hit on disk I/O performance? Aside from copying the data off and back onto the filesystem, is there another way to de-dedup the pool?
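For reference, here's what I've been looking at ("tank" is just a stand-in for my pool name, and zdb's output format varies a bit between builds):

  zpool list tank   # the DEDUP column shows the remaining dedup ratio
  zdb -DD tank      # DDT histogram: entry counts and in-core/on-disk sizes

My assumption is that as long as zdb still reports DDT entries, the table still has to be consulted whenever those deduped blocks are freed or overwritten.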
Thanks,
John

On Jun 13, 2010, at 10:17 PM, Erik Trimble wrote:

> Hernan F wrote:
>> Hello, I tried enabling dedup on a filesystem and moved files into it to
>> take advantage of it. I had about 700GB of files and left it for some hours.
>> When I returned, only 70GB had been moved.
>>
>> I checked zpool iostat, and it showed about 8MB/s R/W performance (the old
>> and new zfs filesystems are in the same pool). So I disabled dedup for a few
>> seconds, and instantly the performance jumped to 80MB/s.
>>
>> It's an Athlon64 X2 machine with 4GB of RAM, used only as a fileserver
>> (4x1TB SATA for ZFS). arcstat.pl shows 2G for arcsz, and top shows 13% CPU
>> during the 8MB/s transfers.
>> Is this normal behavior? Should I always expect such low performance, or is
>> there something wrong with my setup?
>> Thanks in advance,
>> Hernan
>>
> You are severely RAM-limited. In order to do dedup, ZFS has to maintain a
> catalog of every single block it writes, along with the checksum for that
> block. This is called the Dedup Table (DDT for short).
>
> So, during the copy, ZFS has to (a) read a block from the old filesystem,
> (b) check the current DDT to see if that block already exists, and (c) either
> write the block to the new filesystem (adding an appropriate DDT entry for
> it), or write a metadata update with a reference to the existing deduplicated
> block.
>
> Likely, you have two problems:
>
> (1) I suspect your source filesystem has lots of blocks (that is, it's likely
> made up of smaller files). Lots of blocks means lots of seeking back and
> forth to read all those blocks.
>
> (2) Lots of blocks also means lots of entries in the DDT. It's trivial to
> overwhelm a 4GB system with a large DDT. If the DDT can't fit in RAM, parts
> of it have to be read back in from disk on demand.
>
> Thus, here's what's likely going on:
>
> (1) ZFS reads a block and its checksum from the old filesystem.
> (2) It checks the DDT to see if that checksum exists.
> (3) Finding that the entire DDT isn't resident in RAM, it starts a cycle to
> read the remaining (potential) entries from the new filesystem's metadata.
> That is, it tries to reconstruct the missing parts of the DDT from disk,
> which involves a HUGE number of random seek reads on the new filesystem.
>
> In essence, since you likely can't fit the DDT in RAM, each block read from
> the old filesystem forces a flurry of reads from the new filesystem, which
> eats up the IOPS that your single pool can provide. It thrashes the disks.
>
> Your solution is either to buy more RAM, or to find something you can use as
> an L2ARC cache device for your pool. Ideally it would be an SSD, but in this
> case a plain hard drive would do OK (NOT one already in a pool). To add such
> a device, you would do: 'zpool add tank cache mycachedevice'
>
> --
> Erik Trimble
> Java System Support
> Mailstop: usca22-123
> Phone: x17195
> Santa Clara, CA
> Timezone: US/Pacific (GMT-0800)
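(A back-of-the-envelope note on the RAM argument above, assuming the commonly quoted figure of roughly 320 bytes of core memory per DDT entry; the exact entry size varies by build, so treat the numbers as ballpark only: 700GB of data at an average block size of 64KB is about 11 million blocks, which puts the DDT somewhere around 3.5GB. That is more than a 4GB box can keep resident once the OS and the rest of the ARC are accounted for, which would explain the thrashing described above.)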