On 28 Jun 2008, at 05:14, Robert Milkowski wrote:

> Hello Mark,
>
> Tuesday, April 15, 2008, 8:32:32 PM, you wrote:
>
> MM> The new write throttle code put back into build 87 attempts to
> MM> smooth out the process. We now measure the amount of time it takes
> MM> to sync each transaction group, and the amount of data in that group.
> MM> We dynamically resize our write throttle to try to keep the sync
> MM> time constant (at 5secs) under write load. We also introduce
> MM> "fairness" delays on writers when we near pipeline capacity: each
> MM> write is delayed 1/100sec when we are about to "fill up". This
> MM> prevents a single heavy writer from "starving out" occasional
> MM> writers. So instead of coming to an abrupt halt when the pipeline
> MM> fills, we slow down our write pace. The result should be a constant,
> MM> even IO load.
>
> snv_91, 48x 500GB SATA drives in one large stripe:
>
> # zpool create -f test c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0
>     c1t6d0 c1t7d0 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0
>     c2t7d0 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0
>     c4t0d0 c4t1d0 c4t2d0 c4t3d0 c4t4d0 c4t5d0 c4t6d0 c4t7d0 c5t0d0
>     c5t1d0 c5t2d0 c5t3d0 c5t4d0 c5t5d0 c5t6d0 c5t7d0 c6t0d0 c6t1d0
>     c6t2d0 c6t3d0 c6t4d0 c6t5d0 c6t6d0 c6t7d0
> # zfs set atime=off test
>
> # dd if=/dev/zero of=/test/q1 bs=1024k
> ^C34374+0 records in
> 34374+0 records out
>
> # zpool iostat 1
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> [...]
> test        58.9M  21.7T      0  1.19K      0  80.8M
> test         862M  21.7T      0  6.67K      0   776M
> test        1.52G  21.7T      0  5.50K      0   689M
> test        1.52G  21.7T      0  9.28K      0  1.16G
> test        2.88G  21.7T      0  1.14K      0   135M
> test        2.88G  21.7T      0  1.61K      0   206M
> test        2.88G  21.7T      0  18.0K      0  2.24G
> test        5.60G  21.7T      0     79      0   264K
> test        5.60G  21.7T      0      0      0      0
> test        5.60G  21.7T      0  10.9K      0  1.36G
> test        9.59G  21.7T      0  7.09K      0   897M
> test        9.59G  21.7T      0      0      0      0
> test        9.59G  21.7T      0  6.33K      0   807M
> test        9.59G  21.7T      0  17.9K      0  2.24G
> test        13.6G  21.7T      0  1.96K      0   239M
> test        13.6G  21.7T      0      0      0      0
> test        13.6G  21.7T      0  11.9K      0  1.49G
> test        17.6G  21.7T      0  9.91K      0  1.23G
> test        17.6G  21.7T      0      0      0      0
> test        17.6G  21.7T      0  5.48K      0   700M
> test        17.6G  21.7T      0  20.0K      0  2.50G
> test        21.6G  21.7T      0  2.03K      0   244M
> test        21.6G  21.7T      0      0      0      0
> test        21.6G  21.7T      0      0      0      0
> test        21.6G  21.7T      0  4.03K      0   513M
> test        21.6G  21.7T      0  23.7K      0  2.97G
> test        25.6G  21.7T      0  1.83K      0   225M
> test        25.6G  21.7T      0      0      0      0
> test        25.6G  21.7T      0  13.9K      0  1.74G
> test        29.6G  21.7T      1  1.40K   127K   167M
> test        29.6G  21.7T      0      0      0      0
> test        29.6G  21.7T      0  7.14K      0   912M
> test        29.6G  21.7T      0  19.2K      0  2.40G
> test        33.6G  21.7T      1    378   127K  34.8M
> test        33.6G  21.7T      0      0      0      0
> ^C
>
> Well, it doesn't actually look good. Checking with iostat I don't see any
> problems like long service times, etc.
>
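iostat on the pool devices won't show this kind of bottleneck: if the single
dd (or the copy of its data into ZFS) is eating a CPU, the disks themselves
still look perfectly healthy. A quick way to check -- just a sketch, with a
hypothetical count so the run ends on its own:

# ptime dd if=/dev/zero of=/test/q1 bs=1024k count=32768

and, in another terminal while the dd runs:

# iostat -xnz 1

If usr + sys reported by ptime comes close to real, the writer is CPU bound;
if real is much larger while iostat -xnz shows low %b and small asvc_t on the
drives, the time is being spent waiting somewhere other than the disks.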
I suspect a single dd is CPU bound.

> Reducing zfs_txg_synctime to 1 helps a little bit but it's still not
> an even stream of data.
>
> If I start 3 dd streams at the same time then it is slightly better
> (zfs_txg_synctime set back to 5) but still very jumpy.
>

Try setting zfs_txg_synctime to 10; that reduces the txg overhead.

> Reading with one dd produces steady throughput but I'm disappointed
> with the actual performance:
>

Again, probably CPU bound. What's "ptime dd ..." saying?

> test         161G  21.6T  9.94K      0  1.24G      0
> test         161G  21.6T  10.0K      0  1.25G      0
> test         161G  21.6T  10.3K      0  1.29G      0
> test         161G  21.6T  10.1K      0  1.27G      0
> test         161G  21.6T  10.4K      0  1.31G      0
> test         161G  21.6T  10.1K      0  1.27G      0
> test         161G  21.6T  10.4K      0  1.30G      0
> test         161G  21.6T  10.2K      0  1.27G      0
> test         161G  21.6T  10.3K      0  1.29G      0
> test         161G  21.6T  10.0K      0  1.25G      0
> test         161G  21.6T  9.96K      0  1.24G      0
> test         161G  21.6T  10.6K      0  1.33G      0
> test         161G  21.6T  10.1K      0  1.26G      0
> test         161G  21.6T  10.2K      0  1.27G      0
> test         161G  21.6T  10.4K      0  1.30G      0
> test         161G  21.6T  9.62K      0  1.20G      0
> test         161G  21.6T  8.22K      0  1.03G      0
> test         161G  21.6T  9.61K      0  1.20G      0
> test         161G  21.6T  10.2K      0  1.28G      0
> test         161G  21.6T  9.12K      0  1.14G      0
> test         161G  21.6T  9.96K      0  1.25G      0
> test         161G  21.6T  9.72K      0  1.22G      0
> test         161G  21.6T  10.6K      0  1.32G      0
> test         161G  21.6T  9.93K      0  1.24G      0
> test         161G  21.6T  9.94K      0  1.24G      0
>
> zpool scrub produces:
>
> test         161G  21.6T     25     69  2.70M   392K
> test         161G  21.6T  10.9K      0  1.35G      0
> test         161G  21.6T  13.4K      0  1.66G      0
> test         161G  21.6T  13.2K      0  1.63G      0
> test         161G  21.6T  11.8K      0  1.46G      0
> test         161G  21.6T  13.8K      0  1.72G      0
> test         161G  21.6T  12.4K      0  1.53G      0
> test         161G  21.6T  12.9K      0  1.59G      0
> test         161G  21.6T  12.9K      0  1.59G      0
> test         161G  21.6T  13.4K      0  1.67G      0
> test         161G  21.6T  12.2K      0  1.51G      0
> test         161G  21.6T  12.9K      0  1.59G      0
> test         161G  21.6T  12.5K      0  1.55G      0
> test         161G  21.6T  13.3K      0  1.64G      0
>
> So sequential reading gives steady throughput but the numbers are a
> little bit lower than expected.
>
> Sequential writing is still jumpy with single or multiple dd streams
> for a pool with many disk drives.
>
> Let's destroy the pool and create a new, smaller one.
>
> # zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0
> # zfs set atime=off test
>
> # dd if=/dev/zero of=/test/q1 bs=1024k
> ^C15905+0 records in
> 15905+0 records out
>
> # zpool iostat 1
>                capacity     operations    bandwidth
> pool         used  avail   read  write   read  write
> ----------  -----  -----  -----  -----  -----  -----
> [...]
> test         688M  2.72T      0  3.29K      0   401M
> test        1.01G  2.72T      0  3.69K      0   462M
> test        1.35G  2.72T      0  3.59K      0   450M
> test        1.35G  2.72T      0  2.95K      0   372M
> test        2.03G  2.72T      0  3.37K      0   428M
> test        2.03G  2.72T      0  1.94K      0   248M
> test        2.71G  2.72T      0  2.44K      0   301M
> test        2.71G  2.72T      0  3.88K      0   497M
> test        2.71G  2.72T      0  3.86K      0   494M
> test        4.07G  2.71T      0  3.42K      0   425M
> test        4.07G  2.71T      0  3.89K      0   498M
> test        4.07G  2.71T      0  3.88K      0   497M
> test        5.43G  2.71T      0  3.44K      0   429M
> test        5.43G  2.71T      0  3.94K      0   504M
> test        5.43G  2.71T      0  3.88K      0   497M
> test        5.43G  2.71T      0  3.88K      0   497M
> test        7.62G  2.71T      0  2.34K      0   286M
> test        7.62G  2.71T      0  4.23K      0   539M
> test        7.62G  2.71T      0  3.89K      0   498M
> test        7.62G  2.71T      0  3.87K      0   495M
> test        7.62G  2.71T      0  3.88K      0   497M
> test        9.81G  2.71T      0  3.33K      0   418M
> test        9.81G  2.71T      0  4.12K      0   526M
> test        9.81G  2.71T      0  3.88K      0   497M
>
> Much more steady - interesting.

Now it's disk bound.

> Let's do it again with a yet bigger pool and let's keep distributing
> disks in "rows" across controllers.
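(As an aside: the row-wise device ordering shown below doesn't have to be
typed out by hand. A small loop can emit it -- just a sketch, assuming the
c1..c6 controller / t0..t7 target naming used above:)

# for t in 0 1 2 3 4 5 6 7; do for c in 1 2 3 4 5 6; do printf 'c%dt%dd0 ' $c $t; done; done

The output can then be pasted straight onto the zpool create command line.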
>
> # zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0
>     c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0
> # zfs set atime=off test
>
> test        1.35G  5.44T      0  5.42K      0   671M
> test        2.03G  5.44T      0  7.01K      0   883M
> test        2.71G  5.43T      0  6.22K      0   786M
> test        2.71G  5.43T      0  8.09K      0  1.01G
> test        4.07G  5.43T      0  7.14K      0   902M
> test        5.43G  5.43T      0  4.02K      0   507M
> test        5.43G  5.43T      0  5.52K      0   700M
> test        5.43G  5.43T      0  8.04K      0  1.00G
> test        5.43G  5.43T      0  7.70K      0   986M
> test        8.15G  5.43T      0  6.13K      0   769M
> test        8.15G  5.43T      0  7.77K      0   995M
> test        8.15G  5.43T      0  7.67K      0   981M
> test        10.9G  5.43T      0  4.15K      0   517M
> test        10.9G  5.43T      0  7.74K      0   986M
> test        10.9G  5.43T      0  7.76K      0   994M
> test        10.9G  5.43T      0  7.75K      0   993M
> test        14.9G  5.42T      0  6.79K      0   860M
> test        14.9G  5.42T      0  7.50K      0   958M
> test        14.9G  5.42T      0  8.25K      0  1.03G
> test        14.9G  5.42T      0  7.77K      0   995M
> test        18.9G  5.42T      0  4.86K      0   614M
>
> Starting to be more jumpy, but still not as bad as in the first case.
>
> So let's create a pool out of all the disks again, but this time let's
> keep distributing the disks in "rows" across controllers.
>
> # zpool create -f test c1t0d0 c2t0d0 c3t0d0 c4t0d0 c5t0d0 c6t0d0
>     c1t1d0 c2t1d0 c3t1d0 c4t1d0 c5t1d0 c6t1d0 c1t2d0 c2t2d0 c3t2d0
>     c4t2d0 c5t2d0 c6t2d0 c1t3d0 c2t3d0 c3t3d0 c4t3d0 c5t3d0 c6t3d0
>     c1t4d0 c2t4d0 c3t4d0 c4t4d0 c5t4d0 c6t4d0 c1t5d0 c2t5d0 c3t5d0
>     c4t5d0 c5t5d0 c6t5d0 c1t6d0 c2t6d0 c3t6d0 c4t6d0 c5t6d0 c6t6d0
>     c1t7d0 c2t7d0 c3t7d0 c4t7d0 c5t7d0 c6t7d0
> # zfs set atime=off test
>
> test         862M  21.7T      0  5.81K      0   689M
> test        1.52G  21.7T      0  5.50K      0   689M
> test        2.88G  21.7T      0  10.9K      0  1.35G
> test        2.88G  21.7T      0      0      0      0
> test        2.88G  21.7T      0  9.49K      0  1.18G
> test        5.60G  21.7T      0  11.1K      0  1.38G
> test        5.60G  21.7T      0      0      0      0
> test        5.60G  21.7T      0      0      0      0
> test        5.60G  21.7T      0  15.3K      0  1.90G
> test        9.59G  21.7T      0  15.4K      0  1.91G
> test        9.59G  21.7T      0      0      0      0
> test        9.59G  21.7T      0      0      0      0
> test        9.59G  21.7T      0  16.8K      0  2.09G
> test        13.6G  21.7T      0  8.60K      0  1.06G
> test        13.6G  21.7T      0      0      0      0
> test        13.6G  21.7T      0  4.01K      0   512M
> test        13.6G  21.7T      0  20.2K      0  2.52G
> test        17.6G  21.7T      0  2.86K      0   353M
> test        17.6G  21.7T      0      0      0      0
> test        17.6G  21.7T      0  11.6K      0  1.45G
> test        21.6G  21.7T      0  14.1K      0  1.75G
> test        21.6G  21.7T      0      0      0      0
> test        21.6G  21.7T      0      0      0      0
> test        21.6G  21.7T      0  4.74K      0   602M
> test        21.6G  21.7T      0  17.6K      0  2.20G
> test        25.6G  21.7T      0  8.00K      0  1008M
> test        25.6G  21.7T      0      0      0      0
> test        25.6G  21.7T      0      0      0      0
> test        25.6G  21.7T      0  16.8K      0  2.09G
> test        25.6G  21.7T      0  15.0K      0  1.86G
> test        29.6G  21.7T      0     11      0  11.9K
>
> Any idea?
>
> --
> Best regards,
> Robert Milkowski       mailto:[EMAIL PROTECTED]
> http://milek.blogspot.com

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss