On 5/7/2011 6:47 AM, Edward Ned Harvey wrote:
See below.  Right around 400,000 blocks, dedup is suddenly an order of
magnitude slower than without dedup.

400000          10.7sec         136.7sec        143 MB          195 MB
800000          21.0sec         465.6sec        287 MB          391 MB

The interesting thing is that in all these cases, the complete DDT and the
complete data file itself should fit comfortably in ARC.  So it makes no
sense for performance to be so terrible at this level.

So I need to start figuring out exactly what's going on.  Unfortunately I
don't know how to do that very well.  I'm looking for advice from anyone on
how to poke around and see how much memory is being consumed for what
purposes.  I know how to look up c_min, c, and c_max...  But that didn't do
me much good.  The actual value of c barely changes at all over time...
Even when I rm the file, c does not change immediately.

All the other metrics from kstat have less-than-obvious names, so I
don't know what to look for...


Some minor issues that might affect the above:

(1) I'm assuming you run your script repeatedly in the same pool, without deleting the pool. If so, each run of size X+1 should dedup completely against the prior run of size X. E.g. a run with 120000 blocks will dedup its first 110000 blocks against the prior run of 110000.
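If you want to confirm how much of each run actually deduped, zdb can dump the pool's DDT between runs. A minimal sketch, assuming your pool is named "tank" (substitute your own pool name):

        # quick dedup summary: DDT entry counts and overall dedup ratio
        zdb -D tank

        # detailed DDT histogram, bucketed by reference count
        zdb -DD tank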

(2) Can you try NOT enabling "verify"?  Verify *requires* a disk read before writing any potentially dedup-able block. If case #1 above applies, then with dedup on, each subsequent run *rapidly* increases the amount of disk I/O it requires. E.g. the run of 100000 requires no verify reads, but the run of 110000 requires 100000 of them, the run of 120000 requires 110000, etc. This will skew your results as the ARC buffering of file info changes over time.
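For reference, verify is just a value of the dedup property, so switching it on or off per dataset is a one-liner. A sketch, with "tank/bench" standing in for whatever dataset your test file lives on:

        # see which dedup mode is currently in effect
        zfs get dedup tank/bench

        # checksum-only dedup: no read-back before treating a block as a duplicate
        zfs set dedup=on tank/bench

        # dedup with verification: every checksum match forces a read of the on-disk copy
        zfs set dedup=sha256,verify tank/bench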

(3) fflush is NOT the same as fsync. If you're running the script in a loop, it's entirely possible that ZFS hasn't completely committed things to disk yet, which means that you get I/O requests to flush out the ARC write buffer in the middle of your runs. Honestly, I'd do the following for benchmarking:

        i=0
        while [ $i -lt 80 ]
        do
            j=$(( 100000 + ( $i * 10000 ) ))
            ./run_your_script $j
            sync
            sleep 10
            i=$(( $i + 1 ))
        done
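As for watching where the memory goes: the counters you mention (c, c_min, c_max, size) all live in the zfs:0:arcstats kstat, and the kernel debugger can break ARC usage down further. A rough sketch of what I'd look at (exact statistic names may vary between builds):

        # dump every arcstats counter: size, c, c_min, c_max, plus hit/miss and metadata counters
        kstat -p zfs:0:arcstats

        # just the current ARC size and target size
        kstat -p zfs:0:arcstats:size zfs:0:arcstats:c

        # friendlier breakdown of ARC consumers from the kernel debugger (run as root)
        echo "::arc" | mdb -k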



--
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
