Our application consists of multiple cooperating multithreaded processes, and it is both latency and throughput sensitive. Since it originated long ago, several artifacts are less than optimal, but that's the way it is for a while longer. Anyway, I digress. Most threads run in the TS class with boosted "nice" values so as to limit possible interference from the occasional background task; the exceptions are a few very lightweight, infrequent, but urgent threads that run in the RT class. Additionally, hires tick is set, so the default rechoose_interval period is 3ms rather than the normal 30ms.

The curious thing is that on a 12-CPU system I can observe that some CPUs are much busier than others, and latency as observed via prstat -Lm is higher than expected on a lightly loaded system. I presume this is an artifact of threads queueing up for rechoose_interval on the last CPU they ran on instead of migrating. This seems to be borne out by the fact that I can use psrset to create a set containing one of the otherwise idle CPUs, bind a process to it, then delete the processor set, and see that the previously bound process appears to stick on the previously idle CPU (the commands are summarized after the mpstat output below). So far so good, but the other processes still seem to be contending for the busy CPUs, which is suboptimal for our application.

Now comes the real puzzler, to me at least. I set rechoose_interval=0 in /etc/system, reboot, and take it from the top. I thought this would result in the load being spread out over time as threads migrated to, and then stuck to, uncontended CPUs, but that's not what I see. Here is an mpstat snapshot:

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl usr sys  wt idl
  0   14   0  211  2976 1957 2245   84  375  156    0 20874  17   9   0  74
  1    0   0  149    98    2 2612   89  583   73    0 19032  16  11   0  73
  2   12   0  184    86    6 2523   76  589   76    0 17215  13   9   0  77
  3    0   0   96   650  581 2387   64  530   85    0 13249  11   7   0  82
  8   56   0   11     6    1  581    2  227   25    0  1401   2   2   0  97
  9    0   0    6     4    1  550    0  111    8    0   398   1   1   0  98
 10    5   0    8    28   25  546    0   44   14    0   165   0   1   0  99
 11    0   0   16   390  388  219    0   23   18    0    75   0   1   0  99
 16   52   0   13    10    7  223    1   22    5    0   212   0   1   0  99
 17    0   0    5     4    1  322    0   34    5    0   525   0   1   0  99
 18    0   0   15     5    1  319    1   86   11    0  1558   1   1   0  98
 19    1   0   50     8    1  552    4  192   22    0  4406   4   2   0  94
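For reference, the psrset experiment and tuning described above were roughly as follows. The CPU number, set id, and pid here are just placeholders for illustration, not the exact values from our system:

    psrset -c 9          # create a processor set containing otherwise-idle CPU 9; prints the new set id
    psrset -b 1 12345    # bind pid 12345 into set 1
    psrset -d 1          # delete set 1; the process appears to stay on CPU 9 afterwards

and in /etc/system (hires tick was already set; rechoose_interval was added for this experiment):

    set hires_tick=1
    set rechoose_interval=0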
Any thoughts?