> (In your first posting you said mpstat was used, not vmstat, so
> I will assume mpstat).
Oops, of course, I meant mpstat...

> If mpstat shows idle time, then during the idle time, no thread
> is runnable from a high-level, traditional operating system point
> of view.
Not sure about that. mpstat shows idle time for a virtual processor, which represents a single hardware strand. Are you suggesting this implies that all 4 threads on the core are idle?

> In your example, mpstat showed a 50% busy server thread, and
> a 50% busy interrupt CPU, on different cores.  Maybe the critical
> path runs from the interrupt thread to the app thread with no
> overlap, hence each can only stay half busy.
Perhaps.

> Or maybe you have a bottleneck on the client side.
No. Substituting a Xeon server for the T1000 yields 1 Gbps of throughput, so it's definitely not the client's fault.

> Also let me add the standard CMT advice: CMT is designed for
> aggregate thruput and gives the best performance when many threads
> run concurrently.  Your single threaded test is not the best example
> for this architecture.
I am well aware of that. I am only trying to figure out the capabilities of strands and cores, and the effect of different placement policies, in order to come up with good ways of parallelising network servers.

At this point I am inclined to believe it has something to do with TCP, as alluded to by David in a previous response. I have a similar problem on Linux, where moving to UDP yields much better results. I will give it a try on Solaris now.
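The test I have in mind is just a one-way blast with the stream socket swapped for a datagram socket; roughly along these lines (the peer address, port and payload size below are placeholders, not my actual test parameters):

/*
 * Minimal one-way UDP blast, sketch only: peer address, port and payload
 * size are placeholders.  Links on Solaris with -lsocket -lnsl.
 */
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <string.h>

int
main(void)
{
	int fd = socket(AF_INET, SOCK_DGRAM, 0);
	struct sockaddr_in peer;
	char buf[1400];			/* keep each datagram under the MTU */

	if (fd == -1)
		return (1);

	(void) memset(&peer, 0, sizeof (peer));
	peer.sin_family = AF_INET;
	peer.sin_port = htons(5001);
	peer.sin_addr.s_addr = inet_addr("192.168.1.1");
	(void) memset(buf, 'x', sizeof (buf));

	for (;;)			/* blast until killed */
		(void) sendto(fd, buf, sizeof (buf), 0,
		    (struct sockaddr *)&peer, sizeof (peer));
	/* NOTREACHED */
	return (0);
}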

Thanks for your help,
--Elad
