Hi All,

I have a T2000 system running a legacy application (more than 80 processes). The sar(1) output shows about 10% fluctuations in CPU utilization in USER space, while the SYSTEM space variations do not show any evident problem. The problem I am trying to address is this: the system has been dimensioned to accommodate these fluctuations, and hence the average load it can handle is low. If the fluctuations can be reduced, the total throughput the system can promise would be higher.
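For context, I am capturing the utilization roughly as follows (the interval and count here are illustrative, not the exact values used on the production box):

    # System-wide CPU utilization every 5 seconds for 10 minutes;
    # the %usr column is the one that swings by about 10%.
    sar -u 5 120

    # Per-virtual-CPU view over the same window, to see whether the
    # swing is spread across all strands or concentrated on a few.
    mpstat 5 120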
The process-level information hasn't been useful: neither the prstat(1M) output nor its summation shows the fluctuations. While I do not expect the process-level tools to provide statistics that add up exactly to the sar(1) output, I was hoping the same pattern would appear in prstat(1M) as well. Since the number of processes is high, identifying the process group causing the fluctuations is difficult.

1. What is the reason for sar(1) showing the fluctuations while prstat(1M) does not? On CMT systems like the T2000, should we expect either of the tools to be more accurate than the other?

2. Is the assumption that the sar(1) output is accurate with respect to the CPU fluctuations correct? If yes, what might be the source of the fluctuations? Are there other accounting parameters (thread context switches, kernel threads, interrupts, etc.) that sar(1) might be counting which a summation of prstat does not present?

3. Should the problem be approached differently, e.g. by trying to identify physical core utilization or some such? Mapping the system-level information that sar(1) reports to an actual cause at the process (or similar) level has been the problem; see the P.S. below for the approach I was considering.

Looking forward to some insights.

Thanks and best regards,
Shiv
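P.S. For question 3, this is the DTrace sketch I was planning to try in order to attribute user-mode CPU time to processes; the 997 Hz sampling rate and the 60-second window are just values I picked:

    # Count profile samples taken while a CPU was running user code
    # (arg1 is the user-mode PC, non-zero only in user mode), bucketed
    # by process name; exit after 60 seconds and print the aggregation.
    dtrace -n 'profile-997 /arg1/ { @[execname] = count(); } tick-60s { exit(0); }'

If the per-process sample counts swing in step with the sar(1) %usr figure, that should at least point at the process group responsible.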