(In your first posting you said you used mpstat, not vmstat, so
I will assume mpstat.)

mpstat shows the percent of time that threads spend running on the
CPU, where each CPU is a hardware strand, and 4 strands share an
instruction pipeline.  At the lowest level, some of this time is
spent executing instructions in the shared pipeline, and some time is
spent waiting for long latency instructions such as memory loads to
complete, but mpstat does not distinguish between these low level
states.
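
To make the distinction concrete, here is a rough Python sketch. It
assumes the shared pipeline issues at most one instruction per cycle,
and that you have per-strand instruction and cycle counts from a
hardware counter tool such as cpustat. The function name and the
sample numbers are made up for illustration, not measured data.

```python
def strand_breakdown(busy_fraction, instructions, cycles_on_cpu):
    """Split mpstat-style busy time into issuing and stalled shares.

    busy_fraction  -- fraction of wall time the thread was on the strand
                      (what mpstat reports as non-idle)
    instructions   -- instructions retired by the strand (hypothetical counter)
    cycles_on_cpu  -- cycles the thread spent on the strand (hypothetical counter)

    Assumption: at most one instruction issues per cycle, so
    instructions / cycles_on_cpu is the share of on-CPU cycles that
    actually used the pipeline.
    """
    issuing = busy_fraction * instructions / cycles_on_cpu
    stalled = busy_fraction - issuing
    return issuing, stalled


# Hypothetical thread: 50% busy in mpstat, 250 instructions per 1000 cycles.
issuing, stalled = strand_breakdown(0.5, 250, 1000)
print(f"issuing={issuing:.3f} stalled={stalled:.3f} idle={1 - 0.5:.3f}")
```

mpstat would report only the sum of the first two numbers as busy
time; the split between them is visible only through the counters.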

If mpstat shows idle time, then during that time no thread is
runnable from a high-level, traditional operating system point
of view.  The threads are sleeping, e.g., blocked on a lock or
condition variable.

In your example, mpstat showed a 50% busy server thread and
a 50% busy interrupt CPU, on different cores.  Maybe the critical
path runs from the interrupt thread to the app thread with no
overlap, so each can only stay half busy.  Or maybe you have
a bottleneck on the client side.  Hopefully I have given you
sufficient information to interpret the data and find the problem.
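
The no-overlap case can be illustrated with a small Python sketch.
This is a toy model, not a description of your workload: it assumes
each request is handled first by the interrupt thread and then by the
app thread, that the next request does not start until the previous
one finishes, and the per-stage costs are invented.

```python
def serialized_utilization(interrupt_time, app_time, requests):
    """Busy fraction of each thread when the two stages never overlap."""
    interrupt_busy = 0.0
    app_busy = 0.0
    clock = 0.0
    for _ in range(requests):
        # Interrupt thread handles the request while the app thread sleeps.
        clock += interrupt_time
        interrupt_busy += interrupt_time
        # App thread processes it while the interrupt thread is idle.
        clock += app_time
        app_busy += app_time
    return interrupt_busy / clock, app_busy / clock


# Equal (hypothetical) stage costs: neither thread can exceed half busy.
print(serialized_utilization(1.0, 1.0, 1000))
# Unequal costs: the busy fractions still sum to 1 across the two threads.
print(serialized_utilization(1.0, 3.0, 10))
```

In this model the two busy fractions always add up to 100%, so
neither thread shows 0% idle even though the request path itself is
saturated.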

If you are asking more generally how to predict headroom
when multiple threads/CPUs share a core instruction pipeline
using cpustat and mpstat, then the answers are in Ravi's blog.

Also let me add the standard CMT advice: CMT is designed for
aggregate throughput and gives the best performance when many threads
run concurrently.  Your single-threaded test is not the best example
for this architecture.

- Steve


On 10/03/08 11:21, Elad Lahav wrote:
> Thanks, Steve, but the blog does not answer my question.
> With vmstat reporting 50% idle time, and the core greatly under-utilised 
> (about 25%), how do you determine that a *single thread* has reached its 
> limit? I would expect 0% idle time on vmstat in that case.
> The main issue is that the number of instructions-per-second that a 
> thread can execute depends on its own CPI, but also on the execution 
> pattern of the other threads in the core. So a thread that can do 700 
> instructions/sec, if running alone on the core, will do much less when 
> sharing the core with other threads. Thus, you're left with mpstat to 
> tell you whether the thread is saturated. Only I'm not sure whether 
> mpstat is doing the right thing.
> 
> --Elad
> 
> Steve Sistare wrote:
>> See Ravi Talashikar's blog for an explanation of CPU vs core
>> utilization on CMT architectures such as the T1000:
>>   http://blogs.sun.com/travi/entry/ultrasparc_t1_utilization_explained
>>
>> - Steve
> 

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
