On 28 Jun 2008, at 05:14, Robert Milkowski wrote:

> Hello Mark,
>
> Tuesday, April 15, 2008, 8:32:32 PM, you wrote:
>
> MM> The new write throttle code put back into build 87 attempts to
> MM> smooth out the process. We now measure the amount of time it takes
> MM> to sync each transaction group, and the amount of data in that group.
> MM> We dynamically resize our write throttle to try to keep the sync
> MM> time constant (at 5secs) under write load. We also introduce
> MM> "fairness" delays on writers when we near pipeline capacity: each
> MM> write is delayed 1/100sec when we are about to "fill up". This
> MM> prevents a single heavy writer from "starving out" occasional
> MM> writers. So instead of coming to an abrupt halt when the pipeline
> MM> fills, we slow down our write pace. The result should be a constant,
> MM> even IO load.
>
> snv_91, 48x 500GB SATA drives in one large stripe:
>
> # zpool create -f test c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0
>     c1t6d0 c1t7d0 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0
>     c2t7d0 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0
>     c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0 c4t6d0 c4t7d0 c5t0d0
>     c5t1d0 c5t2d0 c5t3d0 c5t4d0 c5t5d0 c5t6d0 c5t7d0 c6t0d0 c6t1d0
>     c6t2d0 c6t3d0 c6t4d0 c6t5d0 c6t6d0 c6t7d0
> # zfs set atime=off test
>
> # dd if=/dev/zero of=/test/q1 bs=1024k
> ^C34374+0 records in
> 34374+0 records out
>
> # zpool iostat 1
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> [...]
> test        58.9M  21.7T      0  1.19K      0  80.8M
> test         862M  21.7T      0  6.67K      0   776M
> test        1.52G  21.7T      0  5.50K      0   689M
> test        1.52G  21.7T      0  9.28K      0  1.16G
> test        2.88G  21.7T      0  1.14K      0   135M
> test        2.88G  21.7T      0  1.61K      0   206M
> test        2.88G  21.7T      0  18.0K      0  2.24G
> test        5.60G  21.7T      0     79      0   264K
> test        5.60G  21.7T      0      0      0      0
> test        5.60G  21.7T      0  10.9K      0  1.36G
> test        9.59G  21.7T      0  7.09K      0   897M
> test        9.59G  21.7T      0      0      0      0
> test        9.59G  21.7T      0  6.33K      0   807M
> test        9.59G  21.7T      0  17.9K      0  2.24G
> test        13.6G  21.7T      0  1.96K      0   239M
> test        13.6G  21.7T      0      0      0      0
> test        13.6G  21.7T      0  11.9K      0  1.49G
> test        17.6G  21.7T      0  9.91K      0  1.23G
> test        17.6G  21.7T      0      0      0      0
> test        17.6G  21.7T      0  5.48K      0   700M
> test        17.6G  21.7T      0  20.0K      0  2.50G
> test        21.6G  21.7T      0  2.03K      0   244M
> test        21.6G  21.7T      0      0      0      0
> test        21.6G  21.7T      0      0      0      0
> test        21.6G  21.7T      0  4.03K      0   513M
> test        21.6G  21.7T      0  23.7K      0  2.97G
> test        25.6G  21.7T      0  1.83K      0   225M
> test        25.6G  21.7T      0      0      0      0
> test        25.6G  21.7T      0  13.9K      0  1.74G
> test        29.6G  21.7T      1  1.40K   127K   167M
> test        29.6G  21.7T      0      0      0      0
> test        29.6G  21.7T      0  7.14K      0   912M
> test        29.6G  21.7T      0  19.2K      0  2.40G
> test        33.6G  21.7T      1    378   127K  34.8M
> test        33.6G  21.7T      0      0      0      0
> ^C
>
> Well, it doesn't actually look good. Checking with iostat I don't see any
> problems like long service times, etc.
>
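iostat on the pool devices won't show this kind of bottleneck: if the single
dd (or the copy of its data into ZFS) is eating a CPU, the disks themselves
still look perfectly healthy. A quick way to check -- just a sketch, with a
hypothetical count so the run ends on its own:

# ptime dd if=/dev/zero of=/test/q1 bs=1024k count=32768

and, in another terminal while the dd runs:

# iostat -xnz 1

If usr + sys reported by ptime comes close to real, the writer is CPU bound;
if real is much larger while iostat -xnz shows low %b and small asvc_t on the
drives, the time is being spent waiting somewhere other than the disks.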
I suspect a single dd is CPU bound.

> Reducing zfs_txg_synctime to 1 helps a little bit but it's still not
> an even stream of data.
>
> If I start 3 dd streams at the same time then it is slightly better
> (zfs_txg_synctime set back to 5) but still very jumpy.
>

Try setting zfs_txg_synctime to 10; that reduces the txg overhead.

> Reading with one dd produces steady throughput but I'm disappointed
> with the actual performance:
>

Again, probably CPU bound. What's "ptime dd ..." saying?

> test         161G  21.6T  9.94K      0  1.24G      0
> test         161G  21.6T  10.0K      0  1.25G      0
> test         161G  21.6T  10.3K      0  1.29G      0
> test         161G  21.6T  10.1K      0  1.27G      0
> test         161G  21.6T  10.4K      0  1.31G      0
> test         161G  21.6T  10.1K      0  1.27G      0
> test         161G  21.6T  10.4K      0  1.30G      0
> test         161G  21.6T  10.2K      0  1.27G      0
> test         161G  21.6T  10.3K      0  1.29G      0
> test         161G  21.6T  10.0K      0  1.25G      0
> test         161G  21.6T  9.96K      0  1.24G      0
> test         161G  21.6T  10.6K      0  1.33G      0
> test         161G  21.6T  10.1K      0  1.26G      0
> test         161G  21.6T  10.2K      0  1.27G      0
> test         161G  21.6T  10.4K      0  1.30G      0
> test         161G  21.6T  9.62K      0  1.20G      0
> test         161G  21.6T  8.22K      0  1.03G      0
> test         161G  21.6T  9.61K      0  1.20G      0
> test         161G  21.6T  10.2K      0  1.28G      0
> test         161G  21.6T  9.12K      0  1.14G      0
> test         161G  21.6T  9.96K      0  1.25G      0
> test         161G  21.6T  9.72K      0  1.22G      0
> test         161G  21.6T  10.6K      0  1.32G      0
> test         161G  21.6T  9.93K      0  1.24G      0
> test         161G  21.6T  9.94K      0  1.24G      0
>
> zpool scrub produces:
>
> test         161G  21.6T     25     69  2.70M   392K
> test         161G  21.6T  10.9K      0  1.35G      0
> test         161G  21.6T  13.4K      0  1.66G      0
> test         161G  21.6T  13.2K      0  1.63G      0
> test         161G  21.6T  11.8K      0  1.46G      0
> test         161G  21.6T  13.8K      0  1.72G      0
> test         161G  21.6T  12.4K      0  1.53G      0
> test         161G  21.6T  12.9K      0  1.59G      0
> test         161G  21.6T  12.9K      0  1.59G      0
> test         161G  21.6T  13.4K      0  1.67G      0
> test         161G  21.6T  12.2K      0  1.51G      0
> test         161G  21.6T  12.9K      0  1.59G      0
> test         161G  21.6T  12.5K      0  1.55G      0
> test         161G  21.6T  13.3K      0  1.64G      0
>
> So sequential reading gives steady throughput but the numbers are a
> little bit lower than expected.
>
> Sequential writing is still jumpy with single or multiple dd streams
> for a pool with many disk drives.
>
> Let's destroy the pool and create a new, smaller one.
>
> # zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0
> # zfs set atime=off test
>
> # dd if=/dev/zero of=/test/q1 bs=1024k
> ^C15905+0 records in
> 15905+0 records out
>
> # zpool iostat 1
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> [...]
> test         688M  2.72T      0  3.29K      0   401M
> test        1.01G  2.72T      0  3.69K      0   462M
> test        1.35G  2.72T      0  3.59K      0   450M
> test        1.35G  2.72T      0  2.95K      0   372M
> test        2.03G  2.72T      0  3.37K      0   428M
> test        2.03G  2.72T      0  1.94K      0   248M
> test        2.71G  2.72T      0  2.44K      0   301M
> test        2.71G  2.72T      0  3.88K      0   497M
> test        2.71G  2.72T      0  3.86K      0   494M
> test        4.07G  2.71T      0  3.42K      0   425M
> test        4.07G  2.71T      0  3.89K      0   498M
> test        4.07G  2.71T      0  3.88K      0   497M
> test        5.43G  2.71T      0  3.44K      0   429M
> test        5.43G  2.71T      0  3.94K      0   504M
> test        5.43G  2.71T      0  3.88K      0   497M
> test        5.43G  2.71T      0  3.88K      0   497M
> test        7.62G  2.71T      0  2.34K      0   286M
> test        7.62G  2.71T      0  4.23K      0   539M
> test        7.62G  2.71T      0  3.89K      0   498M
> test        7.62G  2.71T      0  3.87K      0   495M
> test        7.62G  2.71T      0  3.88K      0   497M
> test        9.81G  2.71T      0  3.33K      0   418M
> test        9.81G  2.71T      0  4.12K      0   526M
> test        9.81G  2.71T      0  3.88K      0   497M
>
> Much more steady - interesting.

Now it's disk bound.

> Let's do it again with a yet bigger pool and let's keep distributing
> disks in "rows" across controllers.
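(As an aside: the row-wise device ordering shown below doesn't have to be
typed out by hand. A small loop can emit it -- just a sketch, assuming the
c1..c6 controller / t0..t7 target naming used above:)

# for t in 0 1 2 3 4 5 6 7; do for c in 1 2 3 4 5 6; do printf 'c%dt%dd0 ' $c $t; done; done

The output can then be pasted straight onto the zpool create command line.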
>
> # zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0
>     c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0
> # zfs set atime=off test
>
> test        1.35G  5.44T      0  5.42K      0   671M
> test        2.03G  5.44T      0  7.01K      0   883M
> test        2.71G  5.43T      0  6.22K      0   786M
> test        2.71G  5.43T      0  8.09K      0  1.01G
> test        4.07G  5.43T      0  7.14K      0   902M
> test        5.43G  5.43T      0  4.02K      0   507M
> test        5.43G  5.43T      0  5.52K      0   700M
> test        5.43G  5.43T      0  8.04K      0  1.00G
> test        5.43G  5.43T      0  7.70K      0   986M
> test        8.15G  5.43T      0  6.13K      0   769M
> test        8.15G  5.43T      0  7.77K      0   995M
> test        8.15G  5.43T      0  7.67K      0   981M
> test        10.9G  5.43T      0  4.15K      0   517M
> test        10.9G  5.43T      0  7.74K      0   986M
> test        10.9G  5.43T      0  7.76K      0   994M
> test        10.9G  5.43T      0  7.75K      0   993M
> test        14.9G  5.42T      0  6.79K      0   860M
> test        14.9G  5.42T      0  7.50K      0   958M
> test        14.9G  5.42T      0  8.25K      0  1.03G
> test        14.9G  5.42T      0  7.77K      0   995M
> test        18.9G  5.42T      0  4.86K      0   614M
>
> Starting to be more jumpy, but still not as bad as in the first case.
>
> So let's create a pool out of all the disks again, but this time let's
> keep distributing the disks in "rows" across controllers.
>
> # zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0
>     c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0 c1t2d0 c2t2d0 c3t2d0
>     c4t2d0 c5t2d0 c6t2d0 c1t3d0 c2t3d0 c3t3d0 c4t3d0 c5t3d0 c6t3d0
>     c1t4d0 c2t4d0 c3t4d0 c4t4d0 c5t4d0 c6t4d0 c1t5d0 c2t5d0 c3t5d0
>     c4t5d0 c5t5d0 c6t5d0 c1t6d0 c2t6d0 c3t6d0 c4t6d0 c5t6d0 c6t6d0
>     c1t7d0 c2t7d0 c3t7d0 c4t7d0 c5t7d0 c6t7d0
> # zfs set atime=off test
>
> test         862M  21.7T      0  5.81K      0   689M
> test        1.52G  21.7T      0  5.50K      0   689M
> test        2.88G  21.7T      0  10.9K      0  1.35G
> test        2.88G  21.7T      0      0      0      0
> test        2.88G  21.7T      0  9.49K      0  1.18G
> test        5.60G  21.7T      0  11.1K      0  1.38G
> test        5.60G  21.7T      0      0      0      0
> test        5.60G  21.7T      0      0      0      0
> test        5.60G  21.7T      0  15.3K      0  1.90G
> test        9.59G  21.7T      0  15.4K      0  1.91G
> test        9.59G  21.7T      0      0      0      0
> test        9.59G  21.7T      0      0      0      0
> test        9.59G  21.7T      0  16.8K      0  2.09G
> test        13.6G  21.7T      0  8.60K      0  1.06G
> test        13.6G  21.7T      0      0      0      0
> test        13.6G  21.7T      0  4.01K      0   512M
> test        13.6G  21.7T      0  20.2K      0  2.52G
> test        17.6G  21.7T      0  2.86K      0   353M
> test        17.6G  21.7T      0      0      0      0
> test        17.6G  21.7T      0  11.6K      0  1.45G
> test        21.6G  21.7T      0  14.1K      0  1.75G
> test        21.6G  21.7T      0      0      0      0
> test        21.6G  21.7T      0      0      0      0
> test        21.6G  21.7T      0  4.74K      0   602M
> test        21.6G  21.7T      0  17.6K      0  2.20G
> test        25.6G  21.7T      0  8.00K      0  1008M
> test        25.6G  21.7T      0      0      0      0
> test        25.6G  21.7T      0      0      0      0
> test        25.6G  21.7T      0  16.8K      0  2.09G
> test        25.6G  21.7T      0  15.0K      0  1.86G
> test        29.6G  21.7T      0     11      0  11.9K
>
> Any idea?
>
> --
> Best regards,
> Robert Milkowski       mailto:[EMAIL PROTECTED]
> http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss