Sorry, turned on html mode to avoid gmail's line wrapping.

On Mon, May 31, 2010 at 4:58 PM, Sandon Van Ness <san...@van-ness.com> wrote:
> On 05/31/2010 02:52 PM, Mike Gerdts wrote:
> > On Mon, May 31, 2010 at 4:32 PM, Sandon Van Ness <san...@van-ness.com> wrote:
> >> On 05/31/2010 01:51 PM, Bob Friesenhahn wrote:
> >>> There are multiple factors at work. Your OpenSolaris should be new
> >>> enough to have the fix in which the zfs I/O tasks are run in a
> >>> scheduling class at lower priority than normal user processes.
> >>> However, there is also a throttling mechanism for processes which
> >>> produce data faster than can be consumed by the disks. This
> >>> throttling mechanism depends on the amount of RAM available to zfs
> >>> and the write speed of the I/O channel. More available RAM results
> >>> in more write buffering, which results in a larger chunk of data
> >>> written at the next transaction group write interval. The maximum
> >>> size of a transaction group may be configured in /etc/system with
> >>> something similar to:
> >>>
> >>> * Set ZFS maximum TXG group size to 2684354560
> >>> set zfs:zfs_write_limit_override = 0xa0000000
> >>>
> >>> If the transaction group is smaller, then zfs will need to write
> >>> more often. Processes will still be throttled, but the duration of
> >>> the delay should be smaller because there is less data to write in
> >>> each burst. I think that (with multiple writers) the zfs pool will
> >>> be "healthier" and less fragmented if you can offer zfs more RAM and
> >>> accept some stalls during writing. There are always tradeoffs.
> >>>
> >>> Bob
> >>>
> >> Well, it seems like messing with the txg sync times and similar
> >> settings did make the transfer smoother, but it didn't actually help
> >> with speeds; it just meant the hangs happened for a shorter time at a
> >> shorter interval, and lowering the time between writes actually
> >> seemed to make things slightly worse.
> >>
> >> I think I have come to the conclusion that the problem here is CPU,
> >> since it only does this with parity raid. If it were I/O bound I
> >> would expect the same behavior either way; if anything the I/O load
> >> is heavier with non-parity raid because it is no longer CPU
> >> bottlenecked (a dd write test gives me nearly 700 megabytes/sec vs
> >> 450 with raidz2).
> >>
> > To see if the CPU is pegged, take a look at the output of:
> >
> > mpstat 1
> > prstat -mLc 1
> >
> > If mpstat shows that the idle time reaches 0, or the process' latency
> > column is more than a few tenths of a percent, you are probably short
> > on CPU.
> >
> > It could also be that interrupts are stealing cycles from rsync.
> > Placing it in a processor set with interrupts disabled in that
> > processor set may help.
> >
> Unfortunately none of these utilities make it possible to get values for
> <1 second, which is how long the hang lasts (it's happening for about
> half a second).
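DTrace can get you below one second. A rough sketch (untested here, and
the half-second bucket size is just a guess based on your description of
the hang) that shows which processes are on CPU in half-second buckets:

  dtrace -qn '
      profile-997 { @[execname] = count(); }
      tick-500ms { printf("%Y\n", walltimestamp); printa(@); trunc(@); }
  '

If rsync's sample count drops while the zpool-data threads' counts spike
during a hang, that points at CPU contention rather than I/O.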
>
> Here is with mpstat:
>
> <snip - badly line-wrapped lines removed>
>
> Here is what I get with prstat:
>
> <snip - removed first interval & fixed formatting of next>
>
> Total: 57 processes, 260 lwps, load averages: 2.15, 2.16, 2.15
>    PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
>    604 root     0.0  33 0.0 0.0 0.0 0.0  42  25  18  13   0   0 zpool-data/13
>    604 root     0.0  30 0.0 0.0 0.0 0.0  41  29  12  12   0   0 zpool-data/15
>   1326 root      12 2.9 0.0 0.0 0.0 0.0  85 0.4  1K  12 11K   0 rsync/1
>    604 root     0.0  15 0.0 0.0 0.0 0.0  41  44 111   9   0   0 zpool-data/27
>    604 root     0.0  14 0.0 0.0 0.0 0.0  43  42  72   3   0   0 zpool-data/33
>    604 root     0.0 5.9 0.0 0.0 0.0 0.0  41  53 109   6   0   0 zpool-data/19
>    604 root     0.0 5.4 0.0 0.0 0.0 0.0  42  53 106   8   0   0 zpool-data/25
>    604 root     0.0 5.3 0.0 0.0 0.0 0.0  43  51 107   7   0   0 zpool-data/21
>    604 root     0.0 4.5 0.0 0.0 0.0 0.0  41  54 110   4   0   0 zpool-data/31
>    604 root     0.0 3.9 0.0 0.0 0.0 0.0  41  55 109   3   0   0 zpool-data/23
>    604 root     0.0 3.7 0.0 0.0 0.0 0.0  44  52 111   2   0   0 zpool-data/29
>   1322 root     0.0 0.4 0.0 0.0 0.0 0.0  98 2.0  1K   0   1   0 rsync/1
>  22644 root     0.0 0.2 0.0 0.0 0.0 0.0 100 0.0  16  13 255   0 prstat/1
>  14409 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   5   3  69   0 sshd/1
>    196 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  15   2 105   0 nscd/17

In the interval above, it looks as though rsync is spending only a very
small amount of time waiting for a CPU (the LAT, or latency, column), but
various zfs threads spent quite a bit of time (relative to the interval)
waiting for a CPU. Compared to the next intervals, rsync also spent very
little time on CPU (USR + SYS): 14.9% vs. 55%, 58%, etc. Perhaps it is not
being fed data quickly enough because CPU contention prevents timely
transfer of data from the NIC to rsync.

> Total: 57 processes, 260 lwps, load averages: 2.15, 2.16, 2.15
>    PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
>   1326 root      44  11 0.0 0.0 0.0 0.0  44 1.3  5K  24 42K   0 rsync/1
>   1322 root     0.0 1.4 0.0 0.0 0.0 0.0  93 5.9  5K   0   0   0 rsync/1
>  22644 root     0.0 0.2 0.0 0.0 0.0 0.0 100 0.0  16   0 255   0 prstat/1
>    604 root     0.0 0.1 0.0 0.0 0.0 0.0 100 0.0   1   0   0   0 zpool-data/2
>  14409 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  14   1 133   0 sshd/1
>    196 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  15   0 105   0 nscd/17
>    604 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  62   0   0   0 zpool-data/3
>    604 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  28   0   0   0 zpool-data/39
>    196 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   1   1   5   0 nscd/7
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/15
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/14
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/13
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/12
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/11
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/10
> Total: 57 processes, 260 lwps, load averages: 2.15, 2.16, 2.15
>    PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
>   1326 root      47  11 0.0 0.0 0.0 0.0  40 1.5  6K  34 45K   0 rsync/1
>   1322 root     0.0 1.4 0.0 0.0 0.0 0.0  93 5.8  6K   0   0   0 rsync/1
>  22644 root     0.0 0.1 0.0 0.0 0.0 0.0 100 0.0  16   1 255   0 prstat/1
>    196 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0  15   0 105   0 nscd/17
>  14409 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   2   3  31   0 sshd/1
>    178 root     0.0 0.0 0.0 0.0 0.0 100 0.0 0.0   1   0   2   0 devfsadm/3
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/15
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/14
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/13
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/12
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/11
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/10
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/9
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/8
>      5 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.0   0   0   0   0 zpool-rpool/7
> Total: 57 processes, 260 lwps, load averages: 2.15, 2.16, 2.15

-- 
Mike Gerdts
http://mgerdts.blogspot.com/
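P.S. If it does turn out that interrupts or the zfs threads are competing
with rsync for CPU, here is a rough sketch of the processor set idea I
mentioned above (untested on your box; see psrset(1M), the CPU ids are just
placeholders, and the set id printed by "psrset -c" may not be 1):

  psrset -c 2 3            create a set from a couple of CPUs not servicing the NIC
  psrset -f 1              disable interrupts on the new set (assuming it came back as set 1)
  psrset -b 1 1326 1322    bind the existing rsync processes (PIDs from your prstat output)

"psrset -n 1" re-enables interrupts and "psrset -d 1" removes the set when
you are done testing.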