Glen Gunselman wrote:
Bob,
Thanks for the info. Sorry about taking so long to reply.
I think I answered most (all?) of your questions in my reply to
Jim Mauros.
I do have a question for you. You said "Note that your prstat
data excerpt is not looking at per-thread statistics, but only
per-process.". The prstat is from a prstat -amL. The top eight lines
have only two PIDs - 5617 and 6084.
Oops! I goofed earlier by not noticing that this *was* per-thread
(LWPID) versus per-process (NLWP) data.
D'oh! Sorry about that! A late-night blunder on my part!
Cheers,
-- Bob
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
  5617 cognos8   53 0.5 0.0 0.0 0.0 2.1 0.0  45  1K 200  3K   0 BIBusTKServe/18
  5617 cognos8   51 0.5 0.0 0.0 0.0 3.6 0.0  45  1K 274  3K   0 BIBusTKServe/17
  6084 cognos8   43 0.6 0.0 0.0 0.0 1.9 0.0  54  2K 222  5K   0 BIBusTKServe/20
  6084 cognos8   43 0.6 0.0 0.0 0.0 1.1 0.0  55  1K 244  4K   0 BIBusTKServe/15
  6084 cognos8   39 0.6 0.0 0.0 0.0 1.8 0.0  59  2K 212  4K   0 BIBusTKServe/22
  5617 cognos8   39 0.4 0.0 0.0 0.0 1.4 0.0  59  1K 223  3K   0 BIBusTKServe/22
  6084 cognos8   35 0.4 0.0 0.0 0.0 1.1 0.0  64  1K 262  2K   0 BIBusTKServe/19
  5617 cognos8   34 0.4 0.0 0.0 0.0 2.2 0.0  64  1K 465  2K   0 BIBusTKServe/23
What does "per-thread" output look like?
Here's the remaining lines from the prstat -amL.
NLWP USERNAME  SIZE   RSS MEMORY      TIME  CPU
 282 cognos8   171G  134G    18%   3:14:22   74
1316 oracle    721G  592G    82%  34:49:50   13
  97 root      356M  198M   0.0%   1:24:18  0.8
   6 sysnav   8576K 5448K   0.0%   0:00:03  0.1
   7 gunselmg   15M   12M   0.0%   0:00:00  0.0
   1 smmsp    4432K 1240K   0.0%   0:00:00  0.0
   1 daemon   2512K 1080K   0.0%   0:00:00  0.0
Total: 278 processes, 1710 lwps, load averages: 20.72, 13.21, 6.74
Thanks again for the help,
Glen Gunselman
Systems Software Specialist
TCS
Emporia State University
Glen:
Some comments offered inline below ...
We have an overloaded
server (V490 with one CPU board) - CPU bound. Here is a sample prstat
-mL taken during a time of high load (uptime
Total: 278 processes, 1710 lwps, load averages: 20.72, 13.21, 6.74):
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
  5617 cognos8   53 0.5 0.0 0.0 0.0 2.1 0.0  45  1K 200  3K   0 BIBusTKServe/18
  5617 cognos8   51 0.5 0.0 0.0 0.0 3.6 0.0  45  1K 274  3K   0 BIBusTKServe/17
  6084 cognos8   43 0.6 0.0 0.0 0.0 1.9 0.0  54  2K 222  5K   0 BIBusTKServe/20
  6084 cognos8   43 0.6 0.0 0.0 0.0 1.1 0.0  55  1K 244  4K   0 BIBusTKServe/15
  6084 cognos8   39 0.6 0.0 0.0 0.0 1.8 0.0  59  2K 212  4K   0 BIBusTKServe/22
  5617 cognos8   39 0.4 0.0 0.0 0.0 1.4 0.0  59  1K 223  3K   0 BIBusTKServe/22
  6084 cognos8   35 0.4 0.0 0.0 0.0 1.1 0.0  64  1K 262  2K   0 BIBusTKServe/19
  5617 cognos8   34 0.4 0.0 0.0 0.0 2.2 0.0  64  1K 465  2K   0 BIBusTKServe/23
 29514 oracle    28 1.2 0.1 0.0 0.0 0.0 8.6  62 217 990 899   0 oracle/1
 29948 root     2.4 0.4 0.0 0.0 0.0 0.0  77  20 109 561 961   0 cfagent/1
  5610 oracle   1.5 0.5 0.0 0.0 0.0 0.0  98 0.1   3   8 871   0 oracle/1
   942 oracle   1.2 0.6 0.0 0.0 0.0 0.0  98 0.0  15  50 506   0 oracle/1
  9378 root     0.4 1.1 0.1 0.0 0.0 0.0  98 0.9  40   9 994   0 prstat/1
  1475 oracle   1.1 0.2 0.4 0.0 0.0 0.0  98 0.2 111  55 945   0 emagent/3047304
 11646 oracle   0.8 0.0 0.0 0.0 0.0 0.0  91 8.7   1  45  80   0 java/56
 11479 oracle   0.6 0.1 0.0 0.0 0.0 0.0  98 1.0   4   4 615   0 oracle/1
 10520 oracle   0.6 0.0 0.0 0.0 0.0 0.0  98 1.4   5   0  45   5 nmccollector/1
   835 sysnav   0.1 0.2 0.1 0.0 0.0 0.0  57  42  19 240 471   0 bb-local.sh/1
  7375 oracle   0.2 0.0 0.0 0.0 0.0 0.0 100 0.0   9   3 192   0 oracle/1
 11712 oracle   0.2 0.0 0.0 0.0 0.0 0.0 100 0.0   8   2 178   0 oracle/1
 11815 oracle   0.2 0.0 0.0 0.0 0.0 100 0.0 0.2   1   3  18   0 java/37
   576 root     0.1 0.1 0.0 0.0 0.0 0.0 100 0.1 331   1  1K   0 nscd/11
 17855 oracle   0.1 0.0 0.0 0.0 0.0 100 0.0 0.1   5   0   5   0 java/2
 11805 oracle   0.1 0.1 0.0 0.0 0.0 0.0  96 3.8   4   7  62   2 perl/1
 11649 oracle   0.1 0.0 0.0 0.0 0.0 0.0 100 0.0   9   0 118   0 oracle/1
 11780 oracle   0.0 0.1 0.0 0.0 0.0 0.0  92 8.3  52   0 354  47 webcached/1
     1 root     0.0 0.1 0.0 0.0 0.0 0.0 100 0.2  13   0 361  14 init/1
  4987 cognos8  0.0 0.1 0.0 0.0 0.0 0.0  57  43 338   4 232   0 java/5
  4972 cognos8  0.1 0.0 0.0 0.0 0.0 0.0  91 8.5  68   0  77   0 cogbootstrap/3
 17855 oracle   0.0 0.1 0.0 0.0 0.0 0.0  51  49 312   2 209   0 java/5
From looking at the LAT column, how do I compute the CPU
resources needed to reduce LAT to more "normal levels"?
First, I should say that tuning LAT should not be a performance tuning
objective, and that there is no such thing as a generic "normal" value
for it. Your goals should be measured in workload performance terms,
and absent that - tuning any other observed metric could be a waste of
time and money. Could you give some insight into the business problem
you are trying to solve in some quantitative terms, and make some
assessment of where you are relative to that goal?
Note that your prstat data excerpt is not looking at per-thread
statistics, but only per-process. Therefore, we really do not know how
many compute-bound threads you actually have. Since this process-level
LAT data is in terms of "percent of elapsed time", it is not really
useful for estimating latent demand for CPU. We would gain more insight
from 'prstat -mL' data. In turn, that data is best interpreted in
the light of matching mpstat and ps data, and we will often capture
other data to complete the picture. Adding CPUs will not benefit much
past the point where each CPU-hog thread essentially has a dedicated
core and the remaining miscellaneous demand is well-served. On the
other hand, your business needs might well be met using fewer worker
threads to begin with - and fewer threads might exhibit less
contention. One would need to know more about the function and design of
BIBusTKServe.
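
The "dedicated core per CPU-hog thread" ceiling can be sketched by counting the LWPs that are either running or waiting for a CPU nearly all the time, i.e. whose USR + SYS + LAT is close to 100%. This is only an illustration: the 90% threshold and the names parse_rows and cpu_hogs are assumptions made here, not anything defined by prstat or by this thread, and the sample is a handful of rows from the prstat -mL output quoted above.

```python
# A few rows from the prstat -mL sample, header included.
SAMPLE = """\
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
  5617 cognos8   53 0.5 0.0 0.0 0.0 2.1 0.0  45  1K 200  3K   0 BIBusTKServe/18
  5617 cognos8   51 0.5 0.0 0.0 0.0 3.6 0.0  45  1K 274  3K   0 BIBusTKServe/17
 29514 oracle    28 1.2 0.1 0.0 0.0 0.0 8.6  62 217 990 899   0 oracle/1
 29948 root     2.4 0.4 0.0 0.0 0.0 0.0  77  20 109 561 961   0 cfagent/1
  5610 oracle   1.5 0.5 0.0 0.0 0.0 0.0  98 0.1   3   8 871   0 oracle/1
  4987 cognos8  0.0 0.1 0.0 0.0 0.0 0.0  57  43 338   4 232   0 java/5
"""


def parse_rows(text):
    """Parse prstat -mL data lines; skip the header and anything malformed."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # A data line has 15 whitespace-separated fields and a numeric PID.
        if len(fields) != 15 or not fields[0].isdigit():
            continue
        rows.append({
            "pid": int(fields[0]),
            "lwp": fields[14],
            "usr": float(fields[2]),
            "sys": float(fields[3]),
            "lat": float(fields[9]),
        })
    return rows


def cpu_hogs(rows, threshold=90.0):
    """Return LWPs whose USR + SYS + LAT meets the (assumed) threshold."""
    return [r for r in rows if r["usr"] + r["sys"] + r["lat"] >= threshold]


hogs = cpu_hogs(parse_rows(SAMPLE))
for r in hogs:
    print(r["pid"], r["lwp"], r["usr"] + r["sys"] + r["lat"])
print("CPU-hog threads:", len(hogs))
```

The count is at best an upper bound on the number of cores that could usefully be dedicated to these threads; it says nothing about lock contention or about whether fewer worker threads would serve the workload equally well.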
There's a lot here that a performance analyst would like to know in a
case like this, such as whether or not your Oracle is configured
ideally, where it fits in the overall workload, and what function that
high-CPU oracle process is performing. I'm always curious in a general
way to know how much of the aggregate CPU usage is going to spin-locks
and other synchronization activities, both at the OS and application
levels.
I'll echo Jim Mauros's sentiment to follow up with more answers.
Best regards,
-- Bob
Page 24 of Solaris Performance and Tools includes the
following statement referring to LAT:
"This is an extremely useful metric--we can use it to estimate
the potential speedup for a thread if more CPU resources are added ..."
I have been unable to find any information on how to turn LAT
into CPU resources. I'm reluctant to use (USR + SYS (370.5 for the top 9
processes) + LAT (507 for the same top 9 processes)) / 100. This seems
way too simple.
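
For reference, the arithmetic being asked about can be written out as a short sketch. It only reproduces the (USR + SYS + LAT) / 100 rule of thumb for the top nine rows of the prstat -mL sample; whether that rule is a sound way to size CPUs is exactly the open question. The name naive_cpu_estimate is made up for this illustration. Summing the rows as printed gives about 370.2 for USR + SYS rather than the 370.5 quoted, a small difference that does not change the result much.

```python
# Top nine rows of the prstat -mL sample: (process/LWPID, USR, SYS, LAT),
# each value in percent of elapsed time.
TOP9 = [
    ("BIBusTKServe/18", 53, 0.5, 45),
    ("BIBusTKServe/17", 51, 0.5, 45),
    ("BIBusTKServe/20", 43, 0.6, 54),
    ("BIBusTKServe/15", 43, 0.6, 55),
    ("BIBusTKServe/22", 39, 0.6, 59),
    ("BIBusTKServe/22", 39, 0.4, 59),
    ("BIBusTKServe/19", 35, 0.4, 64),
    ("BIBusTKServe/23", 34, 0.4, 64),
    ("oracle/1", 28, 1.2, 62),
]


def naive_cpu_estimate(rows):
    """Apply the (USR + SYS + LAT) / 100 rule of thumb to a list of rows.

    Returns (busy, waiting, cpus): time on CPU, time waiting for a CPU,
    and their sum expressed as a number of fully used CPUs.
    """
    busy = sum(usr + sys_ for _, usr, sys_, _ in rows)
    waiting = sum(lat for _, _, _, lat in rows)
    return busy, waiting, (busy + waiting) / 100


busy, waiting, cpus = naive_cpu_estimate(TOP9)
print(f"USR+SYS = {busy:.1f}, LAT = {waiting}, estimate = {cpus:.2f} CPUs")
```

The estimate treats every percent of LAT as CPU time the thread would have consumed had a CPU been free, which ignores lock contention and any other bottleneck that would appear once more CPUs were available.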
Thanks
gleng
Glen Gunselman
Systems Software Specialist
TCS
Emporia State University
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org