Our application consists of multiple cooperating multithreaded processes, and it is both latency and throughput sensitive. Since it originated long ago, several artifacts are less than optimal, but that's the way it is for a while longer. Anyway, I digress. Most threads run in the TS class with boosted "nice" values so as to limit possible interference from the occasional background task; the exceptions are a few very lightweight, infrequent, but urgent threads that run in the RT class. Additionally, hires tick is set, so the default rechoose_interval period is 3ms rather than the normal 30ms.

The curious thing is that on a 12-CPU system I can observe that some CPUs are much busier than others, and latency as observed via prstat -Lm is higher than expected on a lightly loaded system. I presume this is an artifact of threads queueing up for rechoose_interval on the last CPU they ran on instead of migrating. This seems to be borne out by the fact that I can use psrset to create a set containing one of the otherwise idle CPUs, bind a process to it, then delete the processor set, and see that the previously bound process appears to stick on the previously idle CPU (the commands are summarized after the mpstat output below). So far so good, but the other processes still seem to be contending for the busy CPUs, which is suboptimal for our application.

Now comes the real puzzler, to me at least. I set rechoose_interval=0 in /etc/system, reboot, and take it from the top. I thought this would result in the load being spread out over time as threads migrated to, and then stuck to, uncontended CPUs, but that's not what I see. Here is an mpstat snapshot:

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl usr sys  wt idl
  0   14   0  211  2976 1957 2245   84  375  156    0 20874  17   9   0  74
  1    0   0  149    98    2 2612   89  583   73    0 19032  16  11   0  73
  2   12   0  184    86    6 2523   76  589   76    0 17215  13   9   0  77
  3    0   0   96   650  581 2387   64  530   85    0 13249  11   7   0  82
  8   56   0   11     6    1  581    2  227   25    0  1401   2   2   0  97
  9    0   0    6     4    1  550    0  111    8    0   398   1   1   0  98
 10    5   0    8    28   25  546    0   44   14    0   165   0   1   0  99
 11    0   0   16   390  388  219    0   23   18    0    75   0   1   0  99
 16   52   0   13    10    7  223    1   22    5    0   212   0   1   0  99
 17    0   0    5     4    1  322    0   34    5    0   525   0   1   0  99
 18    0   0   15     5    1  319    1   86   11    0  1558   1   1   0  98
 19    1   0   50     8    1  552    4  192   22    0  4406   4   2   0  94
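For reference, the psrset experiment and tuning described above were roughly as follows. The CPU number, set id, and pid here are just placeholders for illustration, not the exact values from our system:

    psrset -c 9          # create a processor set containing otherwise-idle CPU 9; prints the new set id
    psrset -b 1 12345    # bind pid 12345 into set 1
    psrset -d 1          # delete set 1; the process appears to stay on CPU 9 afterwards

and in /etc/system (hires tick was already set; rechoose_interval was added for this experiment):

    set hires_tick=1
    set rechoose_interval=0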
Any thoughts?