Dear folks,

While investigating a performance slowdown (a decline in per-device KB/s write 
rates) at a customer site running a V440 with Solaris 9 4/04, I observed that 
vmstat's run queue periodically rose from 0 or 1 to 4 and kept oscillating in 
this manner. Immediately after each run-queue increase, CPU utilization rose to 
a maximum of about 90% but never touched 100% -- even though the run-queue 
increases pointed to CPU saturation. During these periods the iostat write 
rates (KB/s) to the various devices dropped, i.e. there was a correlation 
between the increase in CPU saturation and the decrease in write throughput.
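
(For reference, the run-queue/utilization numbers came from plain interval 
sampling -- something along these lines, with the same 5-second interval I 
mention further down; the exact flags here are illustrative:

  vmstat 5        # kthr:r for the run queue, cpu us/sy/id for utilization
  iostat -xn 5    # kw/s and %b per device for the write rates
)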

This server was also experiencing a lot of network activity. The top of the 
lockstat lock-contention profile looked like this:

Count indv cuml rcnt     spin Lock                   Caller
-------------------------------------------------------------------------------
295378  52%  52% 1.00        1 0x30006045a28          putq+0x40
92614  16%  69% 1.00        1 0x30006045a28          getq_noenab+0x18
38674   7%  76% 1.00        1 0x30006045a28          queue_service+0x8
13138   2%  78% 1.00        1 vph_mutex+0x2000       page_hashin+0x88


while the lockstat kernel call profile looked like this:

Count indv cuml rcnt     nsec CPU+PIL                Caller
-------------------------------------------------------------------------------
 3694   2%   2% 1.00      620 cpu[2]                 default_copyin+0x1b0
 3673   2%   3% 1.00      611 cpu[1]                 default_copyin+0x1b0
 3482   2%   5% 1.00      639 cpu[3]                 default_copyin+0x1b0
 3432   2%   7% 1.00     1125 cpu[1]                 i_ddi_splx+0x1c
 3388   2%   8% 1.00     1122 cpu[2]                 i_ddi_splx+0x1c
 3231   1%  10% 1.00      993 cpu[3]                 i_ddi_splx+0x1c
 3083   1%  11% 1.00      477 cpu[0]+6               pci_pbm_dma_sync+0x90
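
(In case it helps anyone reproduce this, profiles in these two formats are 
typically produced by lockstat invocations along the lines of:

  lockstat sleep 30             # default contention statistics (adaptive mutex spin, etc.)
  lockstat -kIW -D 20 sleep 30  # kernel profiling by caller, top 20 entries

The sleep interval and -D count above are illustrative, not necessarily what 
was actually used.)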

Given that we appeared to be CPU bound, we took steps to reduce the CPU usage 
of our own device driver, and the I/O write rates improved significantly.

Even though the run-queue values indicated CPU saturation, I noted that 
vmstat's CPU utilization never went all the way to 100%. It hovered around 90% 
or so but never climbed higher. Of course, my vmstat and iostat sampling was at 
5-second intervals, and I could not run DTrace on this system since it is 
running Solaris 9.
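
(For finer-grained data I am considering per-thread microstate sampling, 
assuming its overhead is acceptable on this box -- something like:

  prstat -mL 5

where, as I understand it, the LAT column shows the percentage of time each 
LWP spent waiting for a CPU, which measures saturation more directly than 
vmstat's run queue. I have not tried this on the system in question yet.)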

The excellent Solaris Performance and Tools book by McDougall, Mauro & Gregg 
discusses the difference between utilization and saturation and recommends 
watching for CPU saturation rather than relying on utilization alone.
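
As a rough illustration of what I suspect might be happening (the numbers are 
invented): on a 4-CPU box like this V440, if the load arrives in bursts such 
that all four CPUs are busy with extra threads queued for roughly 4.5 seconds 
out of every 5, and the CPUs go idle for the remaining 0.5 seconds, a 5-second 
vmstat sample would show about 90% utilization while still catching a non-zero 
run queue.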

Is this discrepancy between CPU utilization and CPU saturation normal?

Thanks,


Sri