On 5/7/2011 6:47 AM, Edward Ned Harvey wrote:
See below.  Right around 400,000 blocks, dedup is suddenly an order of
magnitude slower than without dedup.

400000          10.7sec         136.7sec        143 MB          195 MB
800000          21.0sec         465.6sec        287 MB          391 MB

The interesting thing is that in all these cases, the complete DDT and the
complete data file itself should fit comfortably in ARC.  So it makes no
sense for performance to be so terrible at this level.

So I need to start figuring out exactly what's going on.  Unfortunately I
don't know how to do that very well.  I'm looking for advice from anyone on
how to poke around and see how much memory is being consumed for what
purposes.  I know how to look up c_min, c, and c_max...  But that didn't do
me much good.  The actual value of c barely changes at all over time...
Even when I rm the file, c does not change immediately.

All the other metrics from kstat have less-than-obvious names, so I
don't know what to look for...


Some minor issues that might affect the above:

(1) I'm assuming you run your script repeatedly in the same pool, without deleting the pool. If so, each run of size X+1 should dedup completely against the prior run of size X. E.g. a run with 120000 blocks will dedup its first 110000 blocks against the prior run of 110000.
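If you want to confirm how much of each run actually deduped, zdb can dump the pool's DDT between runs. A minimal sketch, assuming your pool is named "tank" (substitute your own pool name):

        # quick dedup summary: DDT entry counts and overall dedup ratio
        zdb -D tank

        # detailed DDT histogram, bucketed by reference count
        zdb -DD tank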

(2) Can you try NOT enabling "verify"?  Verify *requires* a disk read before writing any potentially dedup-able block. If case #1 above applies, then with dedup on, each subsequent run *rapidly* increases the amount of disk I/O it requires. E.g. the run of 100000 requires no verify reads, but the run of 110000 requires 100000 of them, the run of 120000 requires 110000, etc. This will skew your results as the ARC buffering of file info changes over time.
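For reference, verify is just a value of the dedup property, so switching it on or off per dataset is a one-liner. A sketch, with "tank/bench" standing in for whatever dataset your test file lives on:

        # see which dedup mode is currently in effect
        zfs get dedup tank/bench

        # checksum-only dedup: no read-back before treating a block as a duplicate
        zfs set dedup=on tank/bench

        # dedup with verification: every checksum match forces a read of the on-disk copy
        zfs set dedup=sha256,verify tank/bench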

(3) fflush is NOT the same as fsync. If you're running the script in a loop, it's entirely possible that ZFS hasn't completely committed things to disk yet, which means that you get I/O requests to flush out the ARC write buffer in the middle of your runs. Honestly, I'd do the following for benchmarking:

        i=0
        while [ $i -lt 80 ]
        do
            j=$(( 100000 + ( $i * 10000 ) ))
            ./run_your_script $j
            sync
            sleep 10
            i=$(( $i + 1 ))
        done
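As for watching where the memory goes: the counters you mention (c, c_min, c_max, size) all live in the zfs:0:arcstats kstat, and the kernel debugger can break ARC usage down further. A rough sketch of what I'd look at (exact statistic names may vary between builds):

        # dump every arcstats counter: size, c, c_min, c_max, plus hit/miss and metadata counters
        kstat -p zfs:0:arcstats

        # just the current ARC size and target size
        kstat -p zfs:0:arcstats:size zfs:0:arcstats:c

        # friendlier breakdown of ARC consumers from the kernel debugger (run as root)
        echo "::arc" | mdb -k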



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
