Dear folks,

While investigating a performance slowdown (a decline in the device write rate, in kb/s) at a customer site running a v440 with Solaris 9 04/04, I observed that vmstat's run queue periodically increased from 0 or 1 to 4 and kept oscillating in this manner. Immediately after each run queue increase, CPU utilization would rise to a maximum of about 90% but never reached 100%, even though the run queue increases pointed to CPU saturation. At the same time, the write rate reported by iostat for these devices decreased, i.e. there was a correlation between the increase in CPU saturation and the decrease in iostat write rates (kb/s) to the various devices.
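
The correlation described above can be checked numerically rather than by eye. What follows is only a minimal sketch, assuming the vmstat run queue values and the iostat write rates have already been collected into two equal-length lists sampled at the same 5-second intervals. The function name `pearson` and its inputs are hypothetical and are not part of any Solaris tool.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two numeric sequences.

    xs: e.g. vmstat run queue samples (assumed input, one per interval)
    ys: e.g. iostat write rates in kb/s for the same intervals (assumed input)
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of co-deviations and of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)

    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")

    return cov / math.sqrt(var_x * var_y)
```

A result close to -1 would be consistent with write rates dropping whenever the run queue grows; a result near 0 would suggest the two are not linearly related at this sampling granularity.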

This server was also experiencing a lot of network activity. The top of the lockstat lock profile looked like this:

Count  indv cuml rcnt     spin Lock                   Caller
-------------------------------------------------------------------------------
295378  52%  52% 1.00        1 0x30006045a28          putq+0x40
 92614  16%  69% 1.00        1 0x30006045a28          getq_noenab+0x18
 38674   7%  76% 1.00        1 0x30006045a28          queue_service+0x8
 13138   2%  78% 1.00        1 vph_mutex+0x2000       page_hashin+0x88

while the lockstat kernel call profile looked like this:

Count  indv cuml rcnt     nsec CPU+PIL                Caller
-------------------------------------------------------------------------------
  3694   2%   2% 1.00      620 cpu[2]                 default_copyin+0x1b0
  3673   2%   3% 1.00      611 cpu[1]                 default_copyin+0x1b0
  3482   2%   5% 1.00      639 cpu[3]                 default_copyin+0x1b0
  3432   2%   7% 1.00     1125 cpu[1]                 i_ddi_splx+0x1c
  3388   2%   8% 1.00     1122 cpu[2]                 i_ddi_splx+0x1c
  3231   1%  10% 1.00      993 cpu[3]                 i_ddi_splx+0x1c
  3083   1%  11% 1.00      477 cpu[0]+6               pci_pbm_dma_sync+0x90

Given that we appeared to be CPU bound, we took steps to reduce the CPU usage of our own device driver, and the I/O writes improved significantly.

Even though our run queues indicated CPU saturation, I noted that vmstat's CPU utilization never went all the way to 100%. It hovered around 90% or so but never went higher. Of course, my vmstat and iostat tracing was at a granularity of 5-second intervals, and I could not run dtrace here.

The excellent Solaris Performance book by McDougall, Mauro & Gregg discusses the differences between utilization and saturation and recommends watching for CPU saturation rather than utilization.

Is this discrepancy between CPU utilization and CPU saturation normal?

Thanks,
Sri

This message posted from opensolaris.org
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org