(In your first posting you said mpstat was used, not vmstat, so I will assume mpstat).
mpstat shows the percentage of time that threads spend running on the CPU, where each CPU is a hardware strand, and 4 strands share an instruction pipeline. At the lowest level, some of this time is spent executing instructions in the shared pipeline, and some is spent waiting for long-latency instructions such as memory loads to complete, but mpstat does not distinguish between these low-level states.

If mpstat shows idle time, then during that time no thread is runnable from a high-level, traditional operating system point of view. The threads are sleeping, e.g. blocked on a lock or condition variable.

In your example, mpstat showed a 50% busy server thread and a 50% busy interrupt CPU, on different cores. Maybe the critical path runs from the interrupt thread to the app thread with no overlap, hence each can only stay half busy. Or maybe you have a bottleneck on the client side. Hopefully I have given you sufficient information to interpret the data and find the problem.

If you are asking more generally how to predict headroom when multiple threads/CPUs share a core instruction pipeline using cpustat and mpstat, then the answers are in Ravi's blog.

Also let me add the standard CMT advice: CMT is designed for aggregate throughput and gives the best performance when many threads run concurrently. Your single-threaded test is not the best example for this architecture.

- Steve

On 10/03/08 11:21, Elad Lahav wrote:
> Thanks, Steve, but the blog does not answer my question.
> With vmstat reporting 50% idle time, and the core greatly under-utilised
> (about 25%), how do you determine that a *single thread* has reached its
> limit? I would expect 0% idle time on vmstat in that case.
> The main issue is that the number of instructions-per-second that a
> thread can execute depends on its own CPI, but also on the execution
> pattern of the other threads in the core.
> So a thread that can do 700
> instructions/sec, if running alone on the core, will do much less when
> sharing the core with other threads. Thus, you're left with mpstat to
> tell you whether the thread is saturated. Only I'm not sure whether
> mpstat is doing the right thing.
>
> --Elad
>
> Steve Sistare wrote:
>> See Ravi Talashikar's blog for an explanation of CPU vs core
>> utilization on CMT architectures such as the T1000:
>> http://blogs.sun.com/travi/entry/ultrasparc_t1_utilization_explained
>>
>> - Steve
> _______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org