Matthew,
In addition to Roland's comment about patching (which is particularly
important due to the rate of change in ZFS) ...
1) Your dd test doesn't really tell us much. Do you know what size
writes fwd is doing? I doubt it is 128k. You could use truss/dtrace to
find out and then use dd again with the same blocksize.
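For example, something along these lines (a sketch only, untested on your box)
would show the distribution of I/O sizes fwd issues; quantize() buckets arg2,
the byte count passed to each call:

dtrace -n 'syscall::write:entry,syscall::pwrite64:entry /execname == "fwd"/ { @["write bytes"] = quantize(arg2); }'
dtrace -n 'syscall::read:entry,syscall::pread64:entry /execname == "fwd"/ { @["read bytes"] = quantize(arg2); }'

You could then rerun dd with bs= set to whichever size dominates.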
2) You've said the application is single threaded. This isn't a good fit
for the T-series (which was designed for throughput workloads at the
expense of single thread performance ... e.g. the M-series typically has
4x the single thread performance of the T-series).
3) It would be helpful to see 'prstat -Lm' to get a better overall feel
for what is running where (especially where fwd is spending its time).
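For example (assuming fwd is still pid 2153, as in your output below):

prstat -mLp 2153 5

The per-LWP microstate columns (USR/SYS/LAT/SLP) should show whether the
single fwd thread is compute-bound, waiting on I/O, or queued for a CPU.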
4) It's cool to see you using DTrace. You've shown again that one-liners
can tell us so much, and so easily. My only comment is that it would be
useful to know how long your tests ran before you hit ^C.
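One way to make that explicit (just a sketch; pick whatever interval suits)
is to let the script time itself out rather than relying on ^C, e.g.

dtrace -n 'sysinfo:::xcalls { @[pid, execname] = count(); } tick-60s { exit(0); }'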
5) How about some DTrace to find out where in the kernel the cross-calls
are happening? For example ...
dtrace -n 'sysinfo:::xcalls { @[stack()] = count(); }'
... although bringing the system up-to-date with patches may resolve
this issue anyway.
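If the raw stacks turn out to be too noisy, a slightly expanded variant
(again, only a sketch) aggregates on execname as well and prints just the ten
hottest stacks on exit:

dtrace -n 'sysinfo:::xcalls { @[execname, stack()] = count(); } END { trunc(@, 10); }'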
Hope this helps,
Phil
Matthew Flanagan wrote:
Hi,
I'm trying to track down the source of a performance problem with an
application. The application in question is from a firewall vendor and the
component of it that is suffering is the log handling daemon. The log daemon is
single threaded and receives logs from firewalls and also services various
consumers which could be either simple log viewing CLI or GUI clients or log
reporting engines running on other servers.
The performance issue is seen by both the consumers and producers. Producers
complain that the log server is not responding and they are therefore unable
to send logs. The consumers don't complain (but the users do!) but are very
slow when reading the logs, filtering on certain fields, etc.
I'm at a bit of a loss on where to go from here. Below is the analysis I've
done so far. I'm looking for advice on whether this can be solved by throwing
more resources at it or whether it is being limited by a poorly written
application. Please let me know if you'd like to see more output from dtrace or
other commands.
Note, one thing I have seen is that if I stop the reporting engine running on a
different server then the number of cross-calls drops significantly.
The log server specs are:
4 Core T5120 with 16GB RAM
2TB SAN storage (mirrored ZFS pool).
$ uname -a
SunOS logserver 5.10 Generic_137111-01 sun4v sparc SUNW,SPARC-Enterprise-T5120
Here is what I've looked at so far:
1. vmstat - seems ok to me.
# vmstat 3
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr m0 m1 m2 m1 in sy cs us sy id
0 0 0 15422904 700912 133 15 6 0 0 0 0 0 0 0 1 1618 15569 1774 3 3 94
0 0 0 15457416 730808 54 26 13 0 0 0 0 0 0 0 0 1679 18773 1913 3 3 94
0 0 0 15455944 729320 148 8 0 0 0 0 0 0 0 0 0 1073 19944 1000 4 3 93
0 0 0 15454744 728632 223 3 0 0 0 0 0 0 0 0 0 1864 22512 2104 4 3 93
0 0 0 15454232 729328 160 4 0 0 0 0 0 0 0 0 0 994 18935 997 3 3 94
0 0 0 15453400 730080 163 4 0 0 0 0 0 0 0 0 0 1777 18536 2088 3 3 94
0 0 0 15452920 730040 185 3 0 0 0 0 0 0 0 0 0 1950 21576 2198 4 3 93
0 0 0 15452416 730456 146 0 0 0 0 0 0 0 0 0 0 1033 20987 958 3 3 94
2. iostat - just looking at the fibre channel HBA throughput. Not really
pushing the SAN to its limits.
# iostat -xznC 1 | egrep c.$
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.5 1.0 5.7 14.1 0.0 0.0 0.0 15.5 0 1 c1
1.8 87.9 187.7 3028.6 0.0 1.5 0.2 16.5 0 26 c2
1.9 88.1 194.4 3066.4 0.0 1.5 0.2 16.5 0 26 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 521.4 0.0 15009.5 0.0 1.5 0.0 3.0 0 78 c2
0.0 518.4 0.0 15647.2 0.0 1.5 0.0 2.9 0 78 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
5.0 3.0 0.4 0.0 0.0 0.0 0.0 0.0 0 0 c2
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 4.0 0.0 2.0 0.0 0.0 0.0 4.9 0 2 c1
0.0 387.4 0.0 12853.2 0.0 1.0 0.0 2.6 0 59 c2
0.0 414.5 0.0 13446.9 0.0 1.0 0.0 2.5 0 57 c3
Running a `dd` of a 2GB log file shows how fast we can push the SAN, which
peaks at 222 MBytes/s!
# time dd if=2009-06-28_224348_21.log of=/dev/null bs=128k
16368+1 records in
16368+1 records out
real 0m9.298s
user 0m0.094s
sys 0m4.138s
while dd is happening:
# iostat -xznC 1 | egrep c.$
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.5 1.0 5.7 14.0 0.0 0.0 0.0 15.5 0 1 c1
1.9 87.9 189.6 3028.4 0.0 1.5 0.2 16.5 0 26 c2
1.9 88.1 196.2 3066.3 0.0 1.5 0.2 16.4 0 26 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
2.1 0.0 263.5 0.0 0.0 0.0 0.0 8.4 0 2 c2
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
5.1 0.0 651.9 0.0 0.0 0.1 0.0 18.1 0 9 c2
5.1 0.0 651.9 0.0 0.0 0.3 0.0 64.2 0 33 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
5.0 3.0 0.4 0.0 0.0 0.0 0.0 0.0 0 0 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.0 0.0 4.9 0.0 0.0 0.0 0.0 0.2 0 0 c1
386.8 0.0 49513.3 0.0 0.0 20.0 0.0 51.8 0 678 c2
392.7 0.0 50269.1 0.0 0.0 20.0 0.0 51.0 0 667 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
869.9 3.0 110697.5 0.0 0.0 49.6 0.0 56.8 0 1389 c2
843.6 0.0 107982.0 0.0 0.0 49.6 0.0 58.8 0 1356 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
737.5 380.6 94397.2 16839.9 0.0 53.3 0.0 47.7 0 1347 c2
742.4 362.8 95025.5 15487.8 0.0 53.7 0.0 48.6 0 1366 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
884.6 0.0 113231.1 0.0 0.0 53.5 0.0 60.5 0 1547 c2
894.7 0.0 114516.7 0.0 0.0 53.7 0.0 60.0 0 1528 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
824.0 0.0 105467.2 0.0 0.0 50.3 0.0 61.1 0 1425 c2
831.9 0.0 106478.8 0.0 0.0 50.1 0.0 60.3 0 1439 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
837.2 0.0 107162.6 0.0 0.0 51.8 0.0 61.9 0 1472 c2
838.3 0.0 107299.9 0.0 0.0 51.5 0.0 61.4 0 1468 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
844.7 0.0 108124.9 0.0 0.0 51.9 0.0 61.4 0 1468 c2
842.7 0.0 107865.9 0.0 0.0 51.3 0.0 60.9 0 1464 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
866.4 154.9 110896.3 12598.9 0.0 60.8 0.0 59.6 0 1551 c2
857.4 390.7 109745.6 17811.3 0.0 59.2 0.0 47.5 0 1544 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 51.1 0.0 39.6 0.0 0.9 0.0 18.3 0 30 c1
830.5 0.0 106298.7 0.0 0.0 55.7 0.0 67.1 0 1511 c2
842.5 0.0 107839.9 0.0 0.0 54.9 0.0 65.2 0 1530 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 13.0 0.0 8.5 0.0 0.1 0.0 9.5 0 8 c1
355.1 0.0 45454.7 0.0 0.0 13.3 0.0 37.4 0 565 c2
318.0 0.0 40705.3 0.0 0.0 13.3 0.0 41.8 0 572 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
1.0 0.0 128.3 0.0 0.0 0.0 0.0 6.2 0 1 c2
5.0 3.0 0.4 0.0 0.0 0.0 0.0 0.0 0 0 c3
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.0 2.0 0.0 1.0 0.0 0.0 0.0 1.6 0 0 c1
0.0 482.1 0.0 14731.3 0.0 1.2 0.0 2.4 0 64 c2
0.0 428.1 0.0 15401.4 0.0 1.1 0.0 2.6 0 61 c3
I think I can safely eliminate the SAN as the bottleneck.
3. mpstat - the column that stands out to me is xcal. I've seen it peak at up
to 1.5 million xcalls on a single CPU!
# mpstat 2
[snip]
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 96318 337 150 194 3 50 40 7 640 7 12 0 81
1 801 0 338 87 0 144 4 33 171 0 889 61 31 0 8
2 0 0 7 36 0 45 0 16 28 1 543 3 3 0 94
3 0 0 1 15 1 12 0 6 3 0 0 0 0 0 100
4 0 0 50458 305 253 123 9 23 29 4 3513 27 13 0 60
5 0 0 223 215 176 89 2 18 70 0 109 7 1 0 91
6 0 0 10920 9 0 17 0 1 1 1 38 0 1 0 99
7 0 0 195 168 163 10 0 0 38 0 0 0 1 0 99
16 1 0 172 222 0 465 9 80 84 9 2324 22 4 0 73
17 0 0 24 83 1 157 5 16 35 0 870 15 1 0 84
18 0 0 23 52 0 108 0 3 28 0 0 0 0 0 100
19 0 0 1 35 0 69 0 0 0 0 0 0 0 0 100
20 9 0 31 6 0 0 5 0 1 0 17 100 0 0 0
21 0 0 10 26 0 49 0 10 27 0 60 0 0 0 100
22 0 0 0 2 1 0 0 0 0 0 0 0 0 0 100
23 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
24 83 0 592427 26 0 42 7 15 58 1 1922 32 46 0 23
25 1 0 212 50 0 98 1 23 21 3 949 5 2 0 93
26 0 0 1 8 0 16 0 7 8 0 2 0 0 0 100
27 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
28 0 0 117 53 0 105 5 17 24 7 1923 41 6 0 53
29 0 0 3 24 0 47 0 10 9 1 497 2 1 0 97
30 0 0 0 1 0 1 0 0 0 0 0 0 0 0 100
31 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
32 1 0 2254 119 0 251 5 57 102 6 2284 33 8 0 59
33 4 0 334 79 0 163 7 20 37 6 2007 13 6 0 81
34 0 0 3 5 0 9 0 2 2 0 0 0 0 0 100
35 0 0 3 9 0 17 0 0 0 0 0 0 0 0 100
36 0 0 259875 110 0 239 11 33 102 9 2727 10 33 0 57
37 0 0 4 13 0 24 0 5 3 0 282 5 0 0 95
38 0 0 3 37 0 76 0 0 0 0 0 0 0 0 100
39 0 0 0 1 0 1 0 0 0 0 0 0 0 0 100
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 7 0 1593 256 154 28 6 7 42 0 9074 48 25 0 27
1 0 0 2 13 0 23 0 7 0 0 30 0 1 0 99
2 0 0 0 3 0 4 0 0 0 0 80 0 0 0 100
3 0 0 0 2 1 0 0 0 0 0 0 0 0 0 100
4 0 0 523 429 411 124 4 6 24 0 1703 7 5 0 87
5 0 0 0 52 0 102 0 0 0 0 72 0 0 0 100
6 0 0 0 5 0 8 0 0 0 0 0 0 0 0 100
7 0 0 1 19 0 36 0 0 0 0 0 0 0 0 100
16 0 0 125980 44 0 79 7 13 12 0 3319 22 18 0 59
17 0 0 3 15 0 27 0 1 2 0 293 3 1 0 96
18 0 0 1 5 0 9 0 0 0 0 0 0 0 0 100
19 0 0 2 19 0 35 0 0 0 0 0 0 0 0 100
20 0 0 4 23 0 46 0 6 6 0 1254 6 1 0 93
21 0 0 0 2 0 2 0 0 0 0 0 0 0 0 100
22 0 0 0 2 1 0 0 0 0 0 0 0 0 0 100
23 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
24 3 0 94 49 0 89 4 12 20 0 3314 14 7 0 78
25 0 0 0 6 0 10 0 2 10 0 38 0 0 0 100
26 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
27 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
28 0 0 311 32 0 58 3 7 5 0 1156 8 5 0 86
29 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
30 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
31 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
32 1 0 407 53 0 104 4 16 27 1 3042 15 7 0 78
33 0 0 36 86 0 170 0 2 0 0 2 0 0 0 100
34 0 0 0 3 0 5 0 0 0 0 0 0 0 0 100
35 0 0 3 7 0 12 0 0 0 0 0 0 0 0 100
36 0 0 12 49 0 96 1 8 2 0 878 4 1 0 95
37 1 0 32 3 0 4 0 1 0 0 12 0 0 0 100
38 0 0 3 27 0 51 0 0 0 0 0 0 0 0 100
39 0 0 0 4 0 6 0 0 0 0 0 0 0 0 100
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 204 258 146 92 3 16 3 0 2692 7 6 0 87
1 0 0 4 26 0 38 2 7 10 0 1644 13 2 0 84
2 0 0 0 6 0 8 0 1 3 0 160 0 1 0 99
3 0 0 0 2 1 0 0 0 0 0 0 0 0 0 100
4 0 0 238 451 406 171 5 13 6 0 972 4 3 0 93
5 0 0 6 40 4 72 0 1 0 0 52 0 0 0 100
6 0 0 4 17 0 33 0 1 6 0 243 1 1 0 98
7 0 0 1 6 0 10 0 0 0 0 0 0 0 0 100
16 2 0 187838 42 0 72 12 5 37 0 5551 35 33 0 32
17 0 0 5 21 0 40 0 4 0 0 4 0 0 0 100
18 0 0 3 7 0 13 0 0 0 0 0 0 0 0 100
19 0 0 1 21 0 40 0 0 0 0 0 0 0 0 100
20 0 0 2 18 0 35 0 7 14 0 532 2 0 0 98
21 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
22 0 0 0 2 1 0 0 0 0 0 0 0 0 0 100
23 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
24 2 0 128 32 0 57 3 9 90 0 4981 33 8 0 59
25 0 0 1 4 0 7 0 2 0 0 39 0 0 0 100
26 0 0 0 1 0 2 0 1 0 0 0 0 0 0 100
27 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
28 0 0 94 26 0 47 1 5 76 0 1136 4 6 0 89
29 0 0 0 3 0 4 0 1 0 0 2 0 0 0 100
30 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
31 0 0 0 1 0 0 0 0 0 0 0 0 0 0 100
32 0 0 561 38 0 66 5 12 16 0 3872 23 15 0 62
33 0 0 8 78 0 155 2 3 1 0 38 0 0 0 100
34 0 0 0 2 0 3 0 0 0 0 0 0 0 0 100
35 0 0 3 9 0 17 0 0 0 0 0 0 0 0 100
36 0 0 15 60 0 115 6 6 6 0 1457 8 2 0 90
37 0 0 1 4 0 7 0 0 0 0 0 0 0 0 100
38 0 0 5 28 0 53 0 0 0 0 0 0 0 0 100
39 0 0 1 4 0 6 0 0 0 0 0 0 0 0 100
4. Who is making all the cross-calls?
# dtrace -n 'sysinfo:::xcalls { @num[pid,execname] = count(); }'
dtrace: description 'sysinfo:::xcalls ' matched 4 probes
^C
7 svc.startd 1
402 nscd 1
1384 snmpd 1
1759 cpd 1
1921 cpmad 1
[...snip...]
2164 fwd 26
1966 fwd 28
1927 fwd 31
2055 cpd 45
2144 fwd 45
826 snmpd 102
3 fsflush 190
21288 fwm 479
21289 fw_loader 495
21286 fwm 1093
21287 dtrace 1430
0 sched 26601
2153 fwd 337655
5. System calls made by this process:
# dtrace -n 'syscall:::entry /pid == 2153/ { @num[probefunc] = count(); }'
dtrace: description 'syscall:::entry ' matched 229 probes
^C
getpid 1
pollsys 74
pread64 87
pwrite64 134
lwp_sigmask 148
llseek 11010
lseek 18126
read 19654
write 57857
6. What is it reading & writing? The first column below is the file descriptor
number, the second the number of times it was seen:
# dtrace -n 'syscall::read:entry /pid == 2153/ { @[arg0] = count(); }'
dtrace: description 'syscall::read:entry ' matched 1 probe
^C
47 1
49 12
83 12
73 17
51 18
50 21
53 24
56 24
57 282
58 381
55 401
78 5742
70 5777
77 12835
# dtrace -n 'syscall::write:entry /pid == 2153/ { @[arg0] = count(); }'
dtrace: description 'syscall::write:entry ' matched 1 probe
^C
13 4
82 7
78 158
70 160
24 21201
23 24974
22 28747
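An alternative (untested) way to map those descriptors straight to pathnames,
if the fds[] array is available on this Solaris 10 release, would be:

# dtrace -n 'syscall::write:entry /pid == 2153/ { @[fds[arg0].fi_pathname] = count(); }'

pfiles gives the same mapping: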
# pfiles 2153
2153: /opt/CPmds-R65/customers/XXX/CPsuite-R65/fw1/bin/fwd -n
Current rlimit: 65536 file descriptors
[...snip...]
22: S_IFREG mode:0644 dev:256,65541 ino:96177 uid:0 gid:0 size:1197740174
O_RDWR|O_CREAT|O_TRUNC
/var/opt/CPmds-R65/customers/XXX/CPsuite-R65/fw1/log/fw.log
23: S_IFREG mode:0644 dev:256,65541 ino:96178 uid:0 gid:0 size:47337100
O_RDWR|O_CREAT|O_TRUNC
/var/opt/CPmds-R65/customers/XXX/CPsuite-R65/fw1/log/fw.logptr
24: S_IFREG mode:0644 dev:256,65541 ino:96179 uid:0 gid:0 size:39200652
O_RDWR|O_CREAT|O_TRUNC
/var/opt/CPmds-R65/customers/XXX/CPsuite-R65/fw1/log/fw.loginitial_ptr
[...snip...]
70: S_IFSOCK mode:0666 dev:332,0 ino:25771 uid:0 gid:0 size:0
O_RDWR|O_NDELAY
SOCK_STREAM
SO_REUSEADDR,SO_OOBINLINE,SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(0.0.193.232)
sockname: AF_INET 1.1.1.42 port: 18184
peername: AF_INET 1.1.1.5 port: 55011
[...snip...]
77: S_IFREG mode:0644 dev:256,65541 ino:125222 uid:0 gid:0 size:2145391731
O_RDONLY
/var/opt/CPmds-R65/customers/XXX/CPsuite-R65/fw1/log/2009-06-08_215011_153.log
78: S_IFSOCK mode:0666 dev:332,0 ino:44055 uid:0 gid:0 size:0
O_RDWR|O_NDELAY
SOCK_STREAM
SO_REUSEADDR,SO_OOBINLINE,SO_SNDBUF(49152),SO_RCVBUF(49640),IP_NEXTHOP(0.0.193.232)
sockname: AF_INET 1.1.1.42 port: 18184
peername: AF_INET 1.1.1.5 port: 55147
[...snip...]
So the file being written is fw.log, which is to be expected. But most of the
reads are against fd 77 (2009-06-08_215011_153.log), closely followed by fds
70 & 78, which are TCP connections to the log reporting server. So an educated
guess would be that this is the file the reporting server is trying to index.
Checking the reporting server confirms this.
regards
Matthew Flanagan
--
http://wadofstuff.blogspot.com