On 10/03/08 13:20, Elad Lahav wrote:
>> (In your first posting you said vmstat was used, not mpstat, so
>> I will assume mpstat).
> Oops, of course, I meant mpstat...
> 
>> If mpstat shows idle time, then during the idle time, no thread
>> is runnable from a high-level, traditional operating system point
>> of view.
> Not sure about that. mpstat shows idle time for a virtual processor, 
> representing a single hardware strand. Are you suggesting that this 
> implies that all 4 threads on the core are idle?

I meant: if mpstat shows idle time for a CPU, then no thread is
runnable for that CPU during the idle time.  In your example, no
other threads are running on the core, so mpstat must look like this:

% mpstat
CPU  usr+sys  idle
0     50%     50%           \
1     0%      100%           \   these 4 CPUs share a core pipeline
2     0%      100%           /
3     0%      100%          /
...

hence 50% of the time, there are no runnable software threads for
the core and all 4 strands are idle.
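To make the accounting concrete, here is a small Python sketch of that aggregation (my own illustration, not real mpstat parsing — it assumes a simplified (cpu, busy%) table and the T1000's fixed mapping of 4 strands per core, with core id = CPU id // 4):

```python
# Aggregate per-strand (virtual CPU) utilization into per-core figures.
# Assumes T1000-style mapping: 4 strands per core, core id = cpu id // 4.
# Input rows mimic the simplified table above, not actual mpstat fields.

STRANDS_PER_CORE = 4

def core_utilization(rows):
    """rows: list of (cpu_id, busy_pct); returns {core_id: busy_pct}."""
    cores = {}
    for cpu_id, busy in rows:
        core = cpu_id // STRANDS_PER_CORE
        cores.setdefault(core, []).append(busy)
    # Summing strand busy% is an upper bound on core-pipeline busy time;
    # with a single software thread only one strand ever runs, so the
    # sum equals the core's busy time exactly.
    return {core: min(100, sum(b)) for core, b in cores.items()}

# The example above: CPU 0 is 50% busy, CPUs 1-3 are 100% idle.
rows = [(0, 50), (1, 0), (2, 0), (3, 0)]
print(core_utilization(rows))  # {0: 50} -> the core pipeline sits idle half the time
```

So on the core level, nothing is runnable for 50% of the time — which is what I meant by the core being idle.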

In case we are using different terminology, here is mine:

   Solaris CPU == virtual processor == hardware strand
   mpstat shows each CPU individually.
   There are 4 CPUs per core, which share an execution pipeline (on the T1000).
   The OS assigns a software thread to a CPU when the software thread
     becomes runnable.

- Steve


> 
>> In your example, mpstat showed a 50% busy server thread, and
>> a 50% busy interrupt CPU, on different cores.  Maybe the critical
>> path runs from the interrupt thread to the app thread with no
>> overlap, hence each can only stay half busy.
> Perhaps.
> 
>> Or maybe you have a bottleneck on the client side.
> No. Substituting a Xeon server for the T1000 yields 1 Gbps throughput, 
> so it's definitely not the client's fault.
> 
>> Also let me add the standard CMT advice: CMT is designed for
>> aggregate throughput and gives the best performance when many threads
>> run concurrently.  Your single-threaded test is not the best example
>> for this architecture.
> I am well aware of that. I am only trying to figure out the capabilities 
> of strands and cores, and the effect of different placement policies, in 
> order to come up with good ways of parallelising network servers.
> 
> At this point I am inclined to believe it has something to do with TCP, 
> as alluded to by David in a previous response. I have a similar problem 
> when using Linux, and there, moving to UDP yields much better results. I 
> will give it a try on Solaris now.
> 
> Thanks for your help,
> --Elad
> 

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
