Hello Mark,

Tuesday, April 15, 2008, 8:32:32 PM, you wrote:
MM> The new write throttle code put back into build 87 attempts to
MM> smooth out the process. We now measure the amount of time it takes
MM> to sync each transaction group, and the amount of data in that group.
MM> We dynamically resize our write throttle to try to keep the sync
MM> time constant (at 5secs) under write load. We also introduce
MM> "fairness" delays on writers when we near pipeline capacity: each
MM> write is delayed 1/100sec when we are about to "fill up". This
MM> prevents a single heavy writer from "starving out" occasional
MM> writers. So instead of coming to an abrupt halt when the pipeline
MM> fills, we slow down our write pace. The result should be a constant
MM> even IO load.

snv_91, 48x 500GB SATA drives in one large stripe:

# zpool create -f test c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
    c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0 \
    c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0 \
    c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0 c4t6d0 c4t7d0 \
    c5t0d0 c5t1d0 c5t2d0 c5t3d0 c5t4d0 c5t5d0 c5t6d0 c5t7d0 \
    c6t0d0 c6t1d0 c6t2d0 c6t3d0 c6t4d0 c6t5d0 c6t6d0 c6t7d0
# zfs set atime=off test
# dd if=/dev/zero of=/test/q1 bs=1024k
^C34374+0 records in
34374+0 records out

# zpool iostat 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
[...]
test        58.9M  21.7T      0  1.19K      0  80.8M
test         862M  21.7T      0  6.67K      0   776M
test        1.52G  21.7T      0  5.50K      0   689M
test        1.52G  21.7T      0  9.28K      0  1.16G
test        2.88G  21.7T      0  1.14K      0   135M
test        2.88G  21.7T      0  1.61K      0   206M
test        2.88G  21.7T      0  18.0K      0  2.24G
test        5.60G  21.7T      0     79      0   264K
test        5.60G  21.7T      0      0      0      0
test        5.60G  21.7T      0  10.9K      0  1.36G
test        9.59G  21.7T      0  7.09K      0   897M
test        9.59G  21.7T      0      0      0      0
test        9.59G  21.7T      0  6.33K      0   807M
test        9.59G  21.7T      0  17.9K      0  2.24G
test        13.6G  21.7T      0  1.96K      0   239M
test        13.6G  21.7T      0      0      0      0
test        13.6G  21.7T      0  11.9K      0  1.49G
test        17.6G  21.7T      0  9.91K      0  1.23G
test        17.6G  21.7T      0      0      0      0
test        17.6G  21.7T      0  5.48K      0   700M
test        17.6G  21.7T      0  20.0K      0  2.50G
test        21.6G  21.7T      0  2.03K      0   244M
test        21.6G  21.7T      0      0      0      0
test        21.6G  21.7T      0      0      0      0
test        21.6G  21.7T      0  4.03K      0   513M
test        21.6G  21.7T      0  23.7K      0  2.97G
test        25.6G  21.7T      0  1.83K      0   225M
test        25.6G  21.7T      0      0      0      0
test        25.6G  21.7T      0  13.9K      0  1.74G
test        29.6G  21.7T      1  1.40K   127K   167M
test        29.6G  21.7T      0      0      0      0
test        29.6G  21.7T      0  7.14K      0   912M
test        29.6G  21.7T      0  19.2K      0  2.40G
test        33.6G  21.7T      1    378   127K  34.8M
test        33.6G  21.7T      0      0      0      0
^C

Well, it doesn't actually look good. Checking with iostat, I don't see any
problems such as long service times. Reducing zfs_txg_synctime to 1 helps a
little, but the write stream is still not even. If I start 3 dd streams at
the same time it is slightly better (with zfs_txg_synctime set back to 5),
but still very jumpy.
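For anyone following along, here is a minimal sketch of what I mean by
"reducing zfs_txg_synctime" and "3 dd streams" (the tunable-setting syntax
is from memory, so please verify it on your own build before relying on it):

  # lower the txg sync target from 5s to 1s on the live system
  # (decimal value written to the kernel variable via mdb):
  echo 'zfs_txg_synctime/W 0t1' | mdb -kw

  # or persistently, via /etc/system:
  #   set zfs:zfs_txg_synctime = 1

  # three concurrent sequential writers instead of one:
  for i in 1 2 3; do
      dd if=/dev/zero of=/test/q$i bs=1024k &
  done
  wait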
Reading with one dd produces steady throughput, but I'm disappointed with
the actual performance:

test         161G  21.6T  9.94K      0  1.24G      0
test         161G  21.6T  10.0K      0  1.25G      0
test         161G  21.6T  10.3K      0  1.29G      0
test         161G  21.6T  10.1K      0  1.27G      0
test         161G  21.6T  10.4K      0  1.31G      0
test         161G  21.6T  10.1K      0  1.27G      0
test         161G  21.6T  10.4K      0  1.30G      0
test         161G  21.6T  10.2K      0  1.27G      0
test         161G  21.6T  10.3K      0  1.29G      0
test         161G  21.6T  10.0K      0  1.25G      0
test         161G  21.6T  9.96K      0  1.24G      0
test         161G  21.6T  10.6K      0  1.33G      0
test         161G  21.6T  10.1K      0  1.26G      0
test         161G  21.6T  10.2K      0  1.27G      0
test         161G  21.6T  10.4K      0  1.30G      0
test         161G  21.6T  9.62K      0  1.20G      0
test         161G  21.6T  8.22K      0  1.03G      0
test         161G  21.6T  9.61K      0  1.20G      0
test         161G  21.6T  10.2K      0  1.28G      0
test         161G  21.6T  9.12K      0  1.14G      0
test         161G  21.6T  9.96K      0  1.25G      0
test         161G  21.6T  9.72K      0  1.22G      0
test         161G  21.6T  10.6K      0  1.32G      0
test         161G  21.6T  9.93K      0  1.24G      0
test         161G  21.6T  9.94K      0  1.24G      0

zpool scrub produces:

test         161G  21.6T     25     69  2.70M   392K
test         161G  21.6T  10.9K      0  1.35G      0
test         161G  21.6T  13.4K      0  1.66G      0
test         161G  21.6T  13.2K      0  1.63G      0
test         161G  21.6T  11.8K      0  1.46G      0
test         161G  21.6T  13.8K      0  1.72G      0
test         161G  21.6T  12.4K      0  1.53G      0
test         161G  21.6T  12.9K      0  1.59G      0
test         161G  21.6T  12.9K      0  1.59G      0
test         161G  21.6T  13.4K      0  1.67G      0
test         161G  21.6T  12.2K      0  1.51G      0
test         161G  21.6T  12.9K      0  1.59G      0
test         161G  21.6T  12.5K      0  1.55G      0
test         161G  21.6T  13.3K      0  1.64G      0

So sequential reading gives steady throughput, but the numbers are a little
lower than I expected. Sequential writing is still jumpy with single or
multiple dd streams on a pool with many disk drives. Let's destroy the pool
and create a new, smaller one.

# zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0
# zfs set atime=off test
# dd if=/dev/zero of=/test/q1 bs=1024k
^C15905+0 records in
15905+0 records out

# zpool iostat 1
               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
[...]
test         688M  2.72T      0  3.29K      0   401M
test        1.01G  2.72T      0  3.69K      0   462M
test        1.35G  2.72T      0  3.59K      0   450M
test        1.35G  2.72T      0  2.95K      0   372M
test        2.03G  2.72T      0  3.37K      0   428M
test        2.03G  2.72T      0  1.94K      0   248M
test        2.71G  2.72T      0  2.44K      0   301M
test        2.71G  2.72T      0  3.88K      0   497M
test        2.71G  2.72T      0  3.86K      0   494M
test        4.07G  2.71T      0  3.42K      0   425M
test        4.07G  2.71T      0  3.89K      0   498M
test        4.07G  2.71T      0  3.88K      0   497M
test        5.43G  2.71T      0  3.44K      0   429M
test        5.43G  2.71T      0  3.94K      0   504M
test        5.43G  2.71T      0  3.88K      0   497M
test        5.43G  2.71T      0  3.88K      0   497M
test        7.62G  2.71T      0  2.34K      0   286M
test        7.62G  2.71T      0  4.23K      0   539M
test        7.62G  2.71T      0  3.89K      0   498M
test        7.62G  2.71T      0  3.87K      0   495M
test        7.62G  2.71T      0  3.88K      0   497M
test        9.81G  2.71T      0  3.33K      0   418M
test        9.81G  2.71T      0  4.12K      0   526M
test        9.81G  2.71T      0  3.88K      0   497M

Much more steady - interesting.
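Some back-of-the-envelope math on those numbers (assuming the load is spread
evenly across the spindles), plus the commands I'd use to check per-device
behaviour during a run:

  # 6-disk stripe, writing:  ~497 MB/s / 6  ~= 83 MB/s per drive,
  #   which is close to what a single 500GB SATA drive can stream.
  # 48-disk stripe, reading: ~1.25 GB/s / 48 ~= 27 MB/s per drive,
  #   nowhere near platter speed, so the disks themselves are probably
  #   not the limit there.
  iostat -xnz 1           # per-device service times and utilization
  zpool iostat -v test 1  # per-vdev distribution of the load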
Let's do it again with a yet bigger pool, and let's keep distributing the
disks in "rows" across controllers.

# zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0 \
    c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0
# zfs set atime=off test

test        1.35G  5.44T      0  5.42K      0   671M
test        2.03G  5.44T      0  7.01K      0   883M
test        2.71G  5.43T      0  6.22K      0   786M
test        2.71G  5.43T      0  8.09K      0  1.01G
test        4.07G  5.43T      0  7.14K      0   902M
test        5.43G  5.43T      0  4.02K      0   507M
test        5.43G  5.43T      0  5.52K      0   700M
test        5.43G  5.43T      0  8.04K      0  1.00G
test        5.43G  5.43T      0  7.70K      0   986M
test        8.15G  5.43T      0  6.13K      0   769M
test        8.15G  5.43T      0  7.77K      0   995M
test        8.15G  5.43T      0  7.67K      0   981M
test        10.9G  5.43T      0  4.15K      0   517M
test        10.9G  5.43T      0  7.74K      0   986M
test        10.9G  5.43T      0  7.76K      0   994M
test        10.9G  5.43T      0  7.75K      0   993M
test        14.9G  5.42T      0  6.79K      0   860M
test        14.9G  5.42T      0  7.50K      0   958M
test        14.9G  5.42T      0  8.25K      0  1.03G
test        14.9G  5.42T      0  7.77K      0   995M
test        18.9G  5.42T      0  4.86K      0   614M

It is starting to get more jumpy, but still not as bad as in the first case.
So let's create a pool out of all the disks again, but this time let's keep
adding the disks in "rows" across controllers.

# zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0 \
    c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0 \
    c1t2d0 c2t2d0 c3t2d0 c4t2d0 c5t2d0 c6t2d0 \
    c1t3d0 c2t3d0 c3t3d0 c4t3d0 c5t3d0 c6t3d0 \
    c1t4d0 c2t4d0 c3t4d0 c4t4d0 c5t4d0 c6t4d0 \
    c1t5d0 c2t5d0 c3t5d0 c4t5d0 c5t5d0 c6t5d0 \
    c1t6d0 c2t6d0 c3t6d0 c4t6d0 c5t6d0 c6t6d0 \
    c1t7d0 c2t7d0 c3t7d0 c4t7d0 c5t7d0 c6t7d0
# zfs set atime=off test

test         862M  21.7T      0  5.81K      0   689M
test        1.52G  21.7T      0  5.50K      0   689M
test        2.88G  21.7T      0  10.9K      0  1.35G
test        2.88G  21.7T      0      0      0      0
test        2.88G  21.7T      0  9.49K      0  1.18G
test        5.60G  21.7T      0  11.1K      0  1.38G
test        5.60G  21.7T      0      0      0      0
test        5.60G  21.7T      0      0      0      0
test        5.60G  21.7T      0  15.3K      0  1.90G
test        9.59G  21.7T      0  15.4K      0  1.91G
test        9.59G  21.7T      0      0      0      0
test        9.59G  21.7T      0      0      0      0
test        9.59G  21.7T      0  16.8K      0  2.09G
test        13.6G  21.7T      0  8.60K      0  1.06G
test        13.6G  21.7T      0      0      0      0
test        13.6G  21.7T      0  4.01K      0   512M
test        13.6G  21.7T      0  20.2K      0  2.52G
test        17.6G  21.7T      0  2.86K      0   353M
test        17.6G  21.7T      0      0      0      0
test        17.6G  21.7T      0  11.6K      0  1.45G
test        21.6G  21.7T      0  14.1K      0  1.75G
test        21.6G  21.7T      0      0      0      0
test        21.6G  21.7T      0      0      0      0
test        21.6G  21.7T      0  4.74K      0   602M
test        21.6G  21.7T      0  17.6K      0  2.20G
test        25.6G  21.7T      0  8.00K      0  1008M
test        25.6G  21.7T      0      0      0      0
test        25.6G  21.7T      0      0      0      0
test        25.6G  21.7T      0  16.8K      0  2.09G
test        25.6G  21.7T      0  15.0K      0  1.86G
test        29.6G  21.7T      0     11      0  11.9K

Any idea?

-- 
Best regards,
 Robert Milkowski                          mailto:[EMAIL PROTECTED]
                                           http://milek.blogspot.com
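P.S. In case the "rows" ordering above is not obvious: the list just takes
target 0 on every controller, then target 1, and so on. I typed it out by
hand, but a little loop like this (purely illustrative) would produce the
same device list:

  for t in 0 1 2 3 4 5 6 7; do
      for c in 1 2 3 4 5 6; do
          printf 'c%st%sd0 ' "$c" "$t"
      done
  done
  echo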