> From: Erik Trimble [mailto:erik.trim...@oracle.com]
>
> (1) I'm assuming you run your script repeatedly in the same pool,
> without deleting the pool. If that is the case, that means that a run of
> X+1 should dedup completely with the run of X. E.g. a run with 120000
> blocks will dedup the first 110000 blocks with the prior run of 110000.
I rm the file in between each run. So if I'm not mistaken, no dedup
happens on consecutive runs based on previous runs.

> (2) Can you NOT enable "verify"? Verify *requires* a disk read before
> writing for any potential dedup-able block.

Every block is unique. There is never anything to verify, because there
is never a checksum match.

Why would I test dedup on non-dedupable data? You can see it's a test.
In any pool where you want to enable dedup, you're going to have some
number of dedupable blocks and some number of non-dedupable blocks. The
memory requirement is based on the number of allocated blocks in the
pool. So I want to establish an upper and a lower bound for dedup
performance. I am running some tests on entirely duplicate data to see
how fast it goes, and also running the described test on entirely
non-duplicate data, both with enough RAM and without enough RAM, as
verification that we know how to predict the lower bound. So far, I'm
failing to predict the lower bound, which is why I've come here to talk
about it.

I've done a bunch of tests with dedup=verify or dedup=sha256, and the
results were the same. But I didn't do that for this particular test.
I'll run with just sha256 if you would still like me to, after what I
just said.

> (3) fflush is NOT the same as fsync. If you're running the script in a
> loop, it's entirely possible that ZFS hasn't completely committed
> things to disk yet.

Oh. Well, I'll change that. But I actually sat here and watched the HDD
light, so even though I did that wrong, I can say the hard drive
finished and became idle in between each run. (I stuck sleep statements
in between each run specifically so I could watch the HDD light.)

> i=0
> while [ $i -lt 80 ];
> do
>     j=$[100000 + ($i * 10000)]
>     ./run_your_script $j
>     sync
>     sleep 10
>     i=$[$i+1]
> done

Oh, yeah. That's what I did, minus the sync command. I'll make sure to
include that next time. And I used "time ~/datagenerator".

Incidentally, do fsync() and sync return instantly, or wait?
Because "time sync" might produce 0 sec every time, even if there were
something waiting to be flushed to disk.

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss