AMD posted an interesting write-up discussing some corner cases that operating systems (like Solaris) should consider when using the TSC (Time Stamp Counter) in conjunction with the power management features provided by current Opteron and Athlon 64 processors.

This was posted last Friday to comp.unix.solaris:

http://groups.google.com/group/comp.unix.solaris/msg/ce39f46758e5e539?

For convenience it's also available at:

http://www.opensolaris.org/os/community/performance/technical_docs/amd_tsc_power

AMD's write-up discusses the effects of several processor power management features on the TSC, and describes three scenarios where TSC drift could occur across multi-processor and/or multi-core configurations. These scenarios are:

- P-state changes
- C1 state changes
- STPCLK throttling

TSC drift can manifest as a problem where time periodically appears to "go backwards": the time reported on one core is greater than the time reported subsequently on a different core.

Solaris uses the TSC as its primary time source. The TSC is preferred because it is both high resolution and low latency. Of the three drift scenarios, only C1 state changes and STPCLK throttling are currently relevant for Solaris, as P-state changes are not supported by the OS at this time.

C1-clock ramping
================

As the write-up describes, C1-clock ramping is initiated by the OS via the "hlt" instruction. By default, idle CPUs in Solaris will execute "hlt" when there is no other runnable work available in the system. We have confirmed that TSC drift can happen under Solaris due to C1-clock ramping. However, because the C1-clock ramping feature is at present generally enabled by the system BIOS only on uni-processor systems, we believe there to be little impact for current multi-processor Opteron systems.

Thus far, we have only observed the TSC drift issue manifest on uni-processor, dual-core Opteron and Athlon 64 based systems. We believe this is because the BIOS enables C1-clock ramping on single-socket systems, and having multiple cores allows the per-core TSCs to drift with respect to each other.

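To make the failure mode concrete, here is a toy model of two cores with independent TSCs. It is only a sketch under stated assumptions, not Solaris code and not a description of how the hardware is programmed: the names Core, simulate and ramped_divisor are made up for illustration, and the divisor of 8 is an arbitrary value chosen to make the effect visible.

```python
# Toy model only: not Solaris code, and not a description of real hardware.

class Core:
    """One core with its own TSC that ticks more slowly while halted."""

    def __init__(self):
        self.tsc = 0

    def advance(self, wall_ticks, halted, ramped_divisor=8):
        # Assumption: while a core sits in "hlt" its clock is ramped down by
        # ramped_divisor, so its TSC falls behind that of a busy core.
        step = wall_ticks // ramped_divisor if halted else wall_ticks
        self.tsc += step


def simulate(idle_ticks=1000):
    """Return (first_read, second_read) for a thread that migrates cores."""
    core0, core1 = Core(), Core()

    # core0 stays busy while core1 is idle with its clock ramped down.
    core0.advance(idle_ticks, halted=False)
    core1.advance(idle_ticks, halted=True)

    first_read = core0.tsc            # thread reads the time on core0
    core0.advance(10, halted=False)   # a little real time passes on both cores
    core1.advance(10, halted=False)
    second_read = core1.tsc           # thread has migrated and reads core1
    return first_read, second_read


first, second = simulate()
print("first read:", first, "second read:", second)
print("time went backwards:", second < first)
```

In this model the second read is smaller than the first even though it happened later in real time, which is the "time goes backwards" symptom. If neither core is ever halted (idle_ticks=0 here), the two counters stay in step and the symptom disappears, which is the idea behind the halt_idle_cpus workaround.
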
The short-term code changes suggested by AMD to disable C1-clock ramping on systems that could exhibit TSC drift will be available shortly in a Solaris 10 kernel update patch (x86 kernel patch 118844, revision 23 or later). The Sun bug ID tracking this issue is:

6336786 time doesn't fly when CPUs are not having fun

Pending patch availability, customers with uni-processor, dual-core Opteron/Athlon 64 systems running Solaris are advised to disable idle CPU halting by adding the following line to /etc/system:

    set halt_idle_cpus=0

... and rebooting. Please don't forget to remove this line once the kernel patch (118844-23 or later) with the fix for 6336786 is applied.

I'll be following up this post with Nevada/OpenSolaris diffs (or a webrev) with the fix for 6336786. (Probably tomorrow)

STPCLK throttling
=================

Work to prevent TSC drift under Solaris in the face of STPCLK throttling is underway. Because the STPCLK throttling feature is not currently in widespread use, and because (where used) it would be a rare platform event, we expect the current likelihood of encountering STPCLK throttling induced TSC drift under Solaris to be small.

-Eric