Re: [perf-discuss] prstat LAT - how to interpret

Jim Mauro Mon, 04 Dec 2006 15:52:19 -0800


This microstate represents the percentage of time the threads were sitting
on a run queue (runnable) waiting for their turn on a CPU. The BIBusTKServer
processes are either running in USR mode (while on a CPU) or waiting for
a CPU (for the most part). Each process has several threads (on the order of
20), so each prstat row represents the CPU wait time for all the threads in
the process. Each server is also doing about 3k-5k system calls per second,
but that must be non-IO syscalls, since there's no appreciable SYS time or
SLP time.

So....what does the "r" column in vmstat look like? That's the system-wide
view of run queue depth. Is it consistently at or over 100?

How many CPUs are on this system?
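
One way to look at the run queue question is to pull the "r" column out of
captured vmstat output and compare it with the number of CPUs. The sketch
below is only an illustration: the function names are hypothetical, the
sample text is invented to show the Solaris vmstat layout (it is not from
the system under discussion), ncpus=4 is a placeholder, and treating
"r greater than CPU count" as saturation is a rule-of-thumb assumption
rather than anything stated in this thread.

```python
def run_queue_samples(vmstat_text):
    """Extract the 'r' (runnable threads) column from vmstat output.

    Assumes the Solaris layout, where 'r' is the first column and the
    header lines start with a non-numeric token.
    """
    samples = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if fields and fields[0].isdigit():
            samples.append(int(fields[0]))
    return samples


def fraction_saturated(samples, ncpus):
    """Fraction of samples in which runnable threads exceed ncpus."""
    if not samples:
        return 0.0
    return sum(1 for r in samples if r > ncpus) / len(samples)


# Invented sample, purely to exercise the parser; not real measurements.
EXAMPLE = """\
 kthr      memory            page            disk          faults      cpu
 r b w   swap  free  re  mf pi po fr de sr s0 s1 s2 s3   in   sy   cs us sy id
 12 0 0 900000 40000 1 5 0 0 0 0 0 0 0 0 0 500 9000 3000 95 5 0
 3 0 0 900000 40000 1 5 0 0 0 0 0 0 0 0 0 500 9000 3000 60 5 35
 20 0 0 900000 40000 1 5 0 0 0 0 0 0 0 0 0 500 9000 3000 97 3 0
"""

if __name__ == "__main__":
    rq = run_queue_samples(EXAMPLE)
    # ncpus=4 is a placeholder; substitute the real CPU count.
    print(rq, fraction_saturated(rq, ncpus=4))
```

On Solaris the first data line of vmstat is an average since boot, so it
is usually dropped before doing this kind of comparison.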

Your question is a good one, but it's very difficult to predict how much the
addition of CPUs (or a change to faster CPUs) will ultimately impact
the statistics, and, much more importantly, the application performance.
Which brings me to the next question - is the application performing well,
or are you chasing a performance problem?

Not knowing anything more about the workload, it certainly seems obvious
that more processors will help, in terms of reducing CPU wait time. When that
happens, more threads can run concurrently, and how that ultimately helps
(or hurts) throughput depends on application design, locking and
dependencies.

My knee-jerk reaction is that I would want at least a 16-way system to run
the BIBusTKServer processes. Looks like a decent candidate for the T2000
(32 threads), assuming it meets other constraints (next to zero floating
point, for example).

- What system is this running on now?
- What version of Solaris?
- How is overall workload/application performance, currently?

Theoretically, with enough CPUs (cores) to run each thread, LAT time
would drop to near zero - the problem is we don't know if all those
running threads will contend on something else (at least I don't :^).
It's unrealistic (probably) to suggest 160 or so cores!
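
As a rough illustration of the arithmetic in the original question, the
sketch below sums USR + SYS + LAT over the nine busiest threads in the
quoted prstat sample and divides by 100. The helper name and data layout
are made up for this example, and the result is at best an upper bound on
demand for those threads: it assumes every thread would turn all of its
LAT time into on-CPU time, which ignores any contention on locks or other
resources.

```python
# Per-thread (USR, SYS, LAT) percentages copied from the quoted prstat -mL
# sample: eight BIBusTKServe threads plus the busiest oracle thread.
SAMPLE = [
    (53, 0.5, 45), (51, 0.5, 45), (43, 0.6, 54), (43, 0.6, 55),
    (39, 0.6, 59), (39, 0.4, 59), (35, 0.4, 64), (34, 0.4, 64),
    (28, 1.2, 62),
]


def cpu_demand(rows):
    """Return (on_cpu, waiting, total) in units of whole CPUs.

    Hypothetical helper: treats 100% of one microstate column, summed
    across threads, as one CPU's worth of time.
    """
    on_cpu = sum(usr + sy for usr, sy, _ in rows) / 100.0
    waiting = sum(lat for _, _, lat in rows) / 100.0
    return on_cpu, waiting, on_cpu + waiting


if __name__ == "__main__":
    on_cpu, waiting, total = cpu_demand(SAMPLE)
    print(f"on CPU : {on_cpu:.2f} CPUs")
    print(f"waiting: {waiting:.2f} CPUs")
    print(f"total  : {total:.2f} CPUs (upper bound for these threads)")
```

For this sample the total comes out a little under 9 CPUs for these nine
threads alone. Whether adding that many CPUs would actually deliver the
implied speedup is exactly the part the numbers cannot tell you.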

Also, does the workload require 8 threaded BIBusTKServer processes?
Not knowing anything about the design, it's possible you could run
fewer Server processes that would reduce CPU contention while
maintaining throughput (whether that's possible or not is of course
completely dependent on the application design).

HTH - please follow up with us on the questions above, and let's
see where it goes.

/jim


Glen Gunselman wrote:

We have an overloaded server (V490 with one CPU board) - CPU bound.

Here is a sample prstat -mL taken during a time of high load (uptime
Total: 278 processes, 1710 lwps, load averages: 20.72, 13.21, 6.74):

   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
  5617 cognos8   53 0.5 0.0 0.0 0.0 2.1 0.0  45  1K 200  3K   0 BIBusTKServe/18
  5617 cognos8   51 0.5 0.0 0.0 0.0 3.6 0.0  45  1K 274  3K   0 BIBusTKServe/17
  6084 cognos8   43 0.6 0.0 0.0 0.0 1.9 0.0  54  2K 222  5K   0 BIBusTKServe/20
  6084 cognos8   43 0.6 0.0 0.0 0.0 1.1 0.0  55  1K 244  4K   0 BIBusTKServe/15
  6084 cognos8   39 0.6 0.0 0.0 0.0 1.8 0.0  59  2K 212  4K   0 BIBusTKServe/22
  5617 cognos8   39 0.4 0.0 0.0 0.0 1.4 0.0  59  1K 223  3K   0 BIBusTKServe/22
  6084 cognos8   35 0.4 0.0 0.0 0.0 1.1 0.0  64  1K 262  2K   0 BIBusTKServe/19
  5617 cognos8   34 0.4 0.0 0.0 0.0 2.2 0.0  64  1K 465  2K   0 BIBusTKServe/23
 29514 oracle    28 1.2 0.1 0.0 0.0 0.0 8.6  62 217 990 899   0 oracle/1
 29948 root     2.4 0.4 0.0 0.0 0.0 0.0  77  20 109 561 961   0 cfagent/1
  5610 oracle   1.5 0.5 0.0 0.0 0.0 0.0  98 0.1   3   8 871   0 oracle/1
   942 oracle   1.2 0.6 0.0 0.0 0.0 0.0  98 0.0  15  50 506   0 oracle/1
  9378 root     0.4 1.1 0.1 0.0 0.0 0.0  98 0.9  40   9 994   0 prstat/1
  1475 oracle   1.1 0.2 0.4 0.0 0.0 0.0  98 0.2 111  55 945   0 emagent/3047304
 11646 oracle   0.8 0.0 0.0 0.0 0.0 0.0  91 8.7   1  45  80   0 java/56
 11479 oracle   0.6 0.1 0.0 0.0 0.0 0.0  98 1.0   4   4 615   0 oracle/1
 10520 oracle   0.6 0.0 0.0 0.0 0.0 0.0  98 1.4   5   0  45   5 nmccollector/1
   835 sysnav   0.1 0.2 0.1 0.0 0.0 0.0  57  42  19 240 471   0 bb-local.sh/1
  7375 oracle   0.2 0.0 0.0 0.0 0.0 0.0 100 0.0   9   3 192   0 oracle/1
 11712 oracle   0.2 0.0 0.0 0.0 0.0 0.0 100 0.0   8   2 178   0 oracle/1
 11815 oracle   0.2 0.0 0.0 0.0 0.0 100 0.0 0.2   1   3  18   0 java/37
   576 root     0.1 0.1 0.0 0.0 0.0 0.0 100 0.1 331   1  1K   0 nscd/11
 17855 oracle   0.1 0.0 0.0 0.0 0.0 100 0.0 0.1   5   0   5   0 java/2
 11805 oracle   0.1 0.1 0.0 0.0 0.0 0.0  96 3.8   4   7  62   2 perl/1
 11649 oracle   0.1 0.0 0.0 0.0 0.0 0.0 100 0.0   9   0 118   0 oracle/1
 11780 oracle   0.0 0.1 0.0 0.0 0.0 0.0  92 8.3  52   0 354  47 webcached/1
     1 root     0.0 0.1 0.0 0.0 0.0 0.0 100 0.2  13   0 361  14 init/1
  4987 cognos8  0.0 0.1 0.0 0.0 0.0 0.0  57  43 338   4 232   0 java/5
  4972 cognos8  0.1 0.0 0.0 0.0 0.0 0.0  91 8.5  68   0  77   0 cogbootstrap/3
 17855 oracle   0.0 0.1 0.0 0.0 0.0 0.0  51  49 312   2 209   0 java/5

From looking at the LAT column, how do I compute the CPU resources
needed to reduce LAT to more "normal levels"?

Page 24 of Solaris Performance and Tools includes the following
statement referring to LAT:

"This is an extremely useful metric--we can use it to estimate the
potential speedup for a thread if more CPU resources are added ..."

I have been unable to find any information on how to turn LAT into CPU
resources. I'm reluctant to use USR + SYS (370.5 for the top 9 processes)
+ LAT (507 for the same top 9 processes) / 100. This seems way too
simple.

Thanks
gleng
Glen Gunselman
Systems Software Specialist
TCS
Emporia State University
------------------------------------------------------------------------

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
