On Tue, Dec 11, 2007 at 03:10:52AM +0000, Jason George wrote:

> I have an older dual P3-800 Compaq server that started losing time very 
> rapidly after upgrading to a -CURRENT snapshot over the weekend.
> 
> Running anything that threw the machine into a high (>10% sustained) system 
> percentage caused the clock to lose an incredible amount of time... to the 
> tune of 30 minutes in the course of about 4 hours.  My specific repeatable 
> test was to run "sup" with the compress option turned on.
> 
> Once I stopped the userland program that was causing the kernel to have high 
> system load, the clock would slowly start to try to readjust itself.  
> Ultimately, it would get within a second but would never fully sync.
> 
> Has anyone else seen this behaviour?
> 
> (Theo and Otto are aware...)

Yes, but to make things clear, this is a different problem. 

The first problem is: some clocks have a large systematic drift. That
can have several causes, from cheap hardware to hardware or firmware
reporting the wrong timecounter source frequency (we have seen that on
at least one macppc, but machines of other platforms might have the
same problem). Current ntpd can only handle drifts up to +/- 500ppm for
timecounter archs.
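
To put the +/- 500ppm figure in perspective, here is a rough sketch of
the arithmetic, using the numbers from the report quoted above (about
30 minutes lost in about 4 hours). The names drift_ppm and NTPD_MAX_PPM
are made up for illustration and are not part of ntpd; the calculation
is just clock error divided by elapsed time, scaled to parts per
million.

```python
def drift_ppm(clock_error_s, elapsed_s):
    """Average drift in parts per million: clock error per elapsed second, times 1e6.

    A negative result means the clock is running slow (losing time).
    """
    return clock_error_s / elapsed_s * 1_000_000


# Limit quoted in this mail for timecounter archs (hypothetical constant name).
NTPD_MAX_PPM = 500

# Figures from the quoted report: roughly 30 minutes lost over roughly 4 hours.
reported = drift_ppm(-30 * 60, 4 * 3600)
print(reported)                        # -125000.0
print(abs(reported) <= NTPD_MAX_PPM)   # False

# How much clock error per day a drift of exactly 500 ppm corresponds to.
seconds_per_day_at_limit = NTPD_MAX_PPM / 1_000_000 * 86400
print(seconds_per_day_at_limit)
```

So 500ppm works out to roughly 43 seconds per day, while the reported
loss is on the order of 125000ppm, far outside anything a frequency
correction of that size could absorb.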

The second problem is a drift that _depends_ on system load. This is
likely to be a problem with interrupts or scheduling. Both underwent
large changes in current, and the developers in this area should be
(and are) picking this up. 

Oh, and ntpd without -s was never designed to adjust for more than a
couple of minutes of offset. So if your clock is way off and you do not
want to or cannot use -s, use rdate once to get close and then let ntpd
do its work. 

        -Otto
