Howdy all, I too dabbled with dedup and found the performance poor with only 4GB of RAM. I've since disabled dedup and performance is better, but "zpool list" still shows a 1.15x dedup ratio. Is this still a hit on disk I/O performance? Aside from copying the data off and back onto the filesystem, is there another way to de-dedup the pool?
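For reference, here's what I've been looking at ("tank" is just a stand-in for my pool name, and zdb's output format varies a bit between builds):

  zpool list tank   # the DEDUP column shows the remaining dedup ratio
  zdb -DD tank      # DDT histogram: entry counts and in-core/on-disk sizes

My assumption is that as long as zdb still reports DDT entries, the table still has to be consulted whenever those deduped blocks are freed or overwritten.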
Thanks,
John

On Jun 13, 2010, at 10:17 PM, Erik Trimble wrote:

> Hernan F wrote:
>> Hello, I tried enabling dedup on a filesystem and moved files into it to
>> take advantage of it. I had about 700GB of files and left it for some hours.
>> When I returned, only 70GB had been moved.
>>
>> I checked zpool iostat, and it showed about 8MB/s R/W performance (the old
>> and new zfs filesystems are in the same pool). So I disabled dedup for a few
>> seconds, and instantly the performance jumped to 80MB/s.
>>
>> It's an Athlon64 X2 machine with 4GB of RAM, used only as a fileserver
>> (4x1TB SATA for ZFS). arcstat.pl shows 2G for arcsz, and top shows 13% CPU
>> during the 8MB/s transfers.
>> Is this normal behavior? Should I always expect such low performance, or is
>> there something wrong with my setup?
>> Thanks in advance,
>> Hernan
>>
> You are severely RAM-limited. In order to do dedup, ZFS has to maintain a
> catalog of every single block it writes, along with the checksum for that
> block. This is called the Dedup Table (DDT for short).
>
> So, during the copy, ZFS has to (a) read a block from the old filesystem,
> (b) check the current DDT to see if that block already exists, and (c) either
> write the block to the new filesystem (adding an appropriate DDT entry for
> it), or write a metadata update with a reference to the existing deduplicated
> block.
>
> Likely, you have two problems:
>
> (1) I suspect your source filesystem has lots of blocks (that is, it's likely
> made up of smaller files). Lots of blocks means lots of seeking back and
> forth to read all those blocks.
>
> (2) Lots of blocks also means lots of entries in the DDT. It's trivial to
> overwhelm a 4GB system with a large DDT. If the DDT can't fit in RAM, parts
> of it have to be read back in from disk on demand.
>
> Thus, here's what's likely going on:
>
> (1) ZFS reads a block and its checksum from the old filesystem.
> (2) It checks the DDT to see if that checksum exists.
> (3) Finding that the entire DDT isn't resident in RAM, it starts a cycle to
> read the remaining (potential) entries from the new filesystem's metadata.
> That is, it tries to reconstruct the missing parts of the DDT from disk,
> which involves a HUGE number of random seek reads on the new filesystem.
>
> In essence, since you likely can't fit the DDT in RAM, each block read from
> the old filesystem forces a flurry of reads from the new filesystem, which
> eats up the IOPS that your single pool can provide. It thrashes the disks.
>
> Your solution is either to buy more RAM, or to find something you can use as
> an L2ARC cache device for your pool. Ideally it would be an SSD, but in this
> case a plain hard drive would do OK (NOT one already in a pool). To add such
> a device, you would do: 'zpool add tank cache mycachedevice'
>
> --
> Erik Trimble
> Java System Support
> Mailstop: usca22-123
> Phone: x17195
> Santa Clara, CA
> Timezone: US/Pacific (GMT-0800)
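(A back-of-the-envelope note on the RAM argument above, assuming the commonly quoted figure of roughly 320 bytes of core memory per DDT entry; the exact entry size varies by build, so treat the numbers as ballpark only: 700GB of data at an average block size of 64KB is about 11 million blocks, which puts the DDT somewhere around 3.5GB. That is more than a 4GB box can keep resident once the OS and the rest of the ARC are accounted for, which would explain the thrashing described above.)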