AMD posted an interesting write-up discussing some corner cases that
operating systems (like Solaris) should consider when using the TSC
(Time Stamp Counter) in conjunction with the power management features
provided by current Opteron and Athlon 64 processors.

This was posted last Friday to comp.unix.solaris:
http://groups.google.com/group/comp.unix.solaris/msg/ce39f46758e5e539?

For convenience it's also available at:
http://www.opensolaris.org/os/community/performance/technical_docs/amd_tsc_power

AMD's write-up discusses the effects of several processor power
management features on the TSC, and describes three scenarios
where TSC drift could occur across multi-processor and/or
multi-core configurations. These scenarios are:
        - P-state changes
        - C1 state changes
        - STPCLK throttling

TSC clock drift can manifest as a problem where time can
periodically appear to "go backwards" when the time reported
on one core is greater than the time reported subsequently on
a different core.
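
To make the "go backwards" case concrete, here is a minimal sketch in C.
It does not read the real TSC; it models two per-core counters with
plain integers. The sim_* names, the 2 GHz frequency and the tick
values are all assumptions made up for illustration, not anything
taken from Solaris or from AMD's write-up.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical model of one core's counter; not a Solaris structure. */
typedef struct {
    uint64_t ticks;
} sim_core_t;

/* Convert ticks to nanoseconds at an assumed fixed frequency. */
static uint64_t sim_ticks_to_ns(uint64_t ticks, uint64_t hz)
{
    assert(hz != 0);
    /* Split into whole seconds and remainder to avoid overflow. */
    return (ticks / hz) * NSEC_PER_SEC + (ticks % hz) * NSEC_PER_SEC / hz;
}

/* Signed (second - first) in ns; a negative result means time went backwards. */
static int64_t sim_elapsed_ns(const sim_core_t *first,
                              const sim_core_t *second, uint64_t hz)
{
    return (int64_t)sim_ticks_to_ns(second->ticks, hz) -
           (int64_t)sim_ticks_to_ns(first->ticks, hz);
}

/* A thread reads the time on core 0, migrates, then reads it on core 1. */
static int64_t sim_migrated_read(void)
{
    const uint64_t hz = 2000000000ULL;  /* assumed 2 GHz */
    sim_core_t core0 = { 12000000 };    /* core that stayed busy */
    sim_core_t core1 = { 10500000 };    /* core whose counter fell behind */

    sim_core_t first = core0;           /* first reading, taken on core 0 */

    /* One microsecond of real time passes on both cores. */
    core0.ticks += 2000;
    core1.ticks += 2000;

    /* Second reading, taken on core 1 after the migration. */
    return sim_elapsed_ns(&first, &core1, hz);
}
```

In this model the second reading comes out 749000 ns earlier than the
first, even though it was taken later in real time.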

Solaris uses the TSC as its primary time source. The TSC is preferred
because it is both high resolution and low latency. Of the three
drift scenarios, only C1 state changes and STPCLK throttling are
currently relevant for Solaris, as P-state changes are not supported
by the OS at this time.

C1-clock ramping
================

As the write-up describes, C1-clock ramping is initiated by the
OS via the "hlt" instruction. By default, idle CPUs in Solaris
will execute "hlt" when there is no other runnable work available
in the system.

We have confirmed that TSC drift can happen under Solaris due to
C1-clock ramping. However, at present the system BIOS generally
enables the C1-clock ramping feature only on uni-processor systems,
so we believe there to be little impact for current multi-processor
Opteron systems.

Thus far, we have only observed the TSC drift issue on
uni-processor, dual-core Opteron and Athlon 64 based systems.
We believe this is because the BIOS enables C1-clock ramping
on single-socket systems, and having multiple cores allows the
per-core TSCs to drift with respect to each other.
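
As a rough illustration of how such drift could build up, the sketch
below assumes that a halted core's counter advances at a divided-down
rate while its sibling's counter keeps running at full speed. The c1_*
names and the divisor parameter are hypothetical; the actual ramping
behavior is described in AMD's write-up, not here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Ticks one core's counter gains over an interval made up of busy_ticks
 * at full clock plus idle_ticks spent halted with the clock divided by
 * `divisor`. The divisor is a hypothetical parameter for illustration.
 */
static uint64_t c1_ticks_gained(uint64_t busy_ticks, uint64_t idle_ticks,
                                uint64_t divisor)
{
    assert(divisor != 0);
    return busy_ticks + idle_ticks / divisor;
}

/*
 * Gap that opens between a core that stayed busy for the whole interval
 * and a sibling core that was halted for idle_ticks of that interval.
 */
static uint64_t c1_drift_ticks(uint64_t interval_ticks, uint64_t idle_ticks,
                               uint64_t divisor)
{
    assert(idle_ticks <= interval_ticks);
    uint64_t busy_core = c1_ticks_gained(interval_ticks, 0, divisor);
    uint64_t idle_core = c1_ticks_gained(interval_ticks - idle_ticks,
                                         idle_ticks, divisor);
    return busy_core - idle_core;
}
```

Under these assumptions, a core halted for 1600000 of 2000000 nominal
ticks with a divisor of 8 ends up 1400000 ticks behind its sibling,
while a divisor of 1 (no ramping) produces no drift at all. A
single-core system has no sibling counter to compare against, which
fits the observation that only multi-core parts show the problem.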

The short-term code changes suggested by AMD to disable C1-clock
ramping on systems that could exhibit TSC drift will be available
shortly in a Solaris 10 kernel update patch (as of x86 kernel
patch 118844 revision 23 or later).

The Sun bug ID tracking this issue is:
        6336786 time doesn't fly when CPUs are not having fun

Pending patch availability, customers with uni-processor, dual-core
Opteron/Athlon 64 systems running Solaris are advised to disable
idle CPU halting by adding the following line to /etc/system:

set halt_idle_cpus=0

... and rebooting. Please don't forget to remove this line once
the kernel patch (118844-23 or later) with the fix for 6336786 is
applied.

I'll be following up this post with Nevada/OpenSolaris diffs (or
a webrev) containing the fix for 6336786, probably tomorrow.


STPCLK throttling
=================

Work to prevent TSC drift under Solaris in the face of STPCLK
throttling is underway. Because the STPCLK throttling feature is
not currently in widespread use, and because (where used) it would
be a rare platform event, we expect the current likelihood of
encountering TSC drift induced by STPCLK throttling under Solaris
to be small.

-Eric
This message posted from opensolaris.org
_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org