Yo Hal!

On Tue, 31 Jan 2017 15:16:42 -0800
Hal Murray <hmur...@megapathdsl.net> wrote:

> > There was a time I used to buy and provision 10 identical servers
> > at a time.  Of the 10, one often had this problem.  The fix was
> > chrony.
>
> If you ever get another system where ntpd doesn't work but chrony
> does, I'd like to investigate.

Unlikely, I have not done bulk server buys in 2 years.

> > I think this is clearly an ntpd bug.
>
> I think it's more complicated than that.

Of course it is, until we figure out how simple it really is.

> There are long standing occasional reports of ntp classic getting
> stuck with a drift of 500 and not being able to recover by itself.
> It does recover if you delete the drift file and restart ntpd.  I
> assume our code will do the same thing.

I did this on new servers.  So no existing drift file.

> I think your "gives up just before" is off by quite a bit.

No, A/B testing with chronyd and ntpd showed that ntpd only had to bend
the clock a bit more, like chronyd would do, to lock in.  I don't
recall the exact details, but the crystal was way slow.

> ntpd has a limit of 500 ppm.  The kernel may have a similar limit.  I
> haven't dived into that corner of the kernel source yet.  A limit may
> be a good idea to prevent the system getting into strange modes where
> it jumps around rather than converging smoothly.  I'm not enough of a
> PLL geek to explain that area but I know that it's easy to get into
> oscillations.

If chronyd can do it, ntpd should be able to do it.

> I hacked our code to have a limit of 2500.  It didn't work.  Somebody
> claimed it was running at 500 ppm.  I don't know if that's the kernel
> or some corner of our code that I didn't catch.  (I think there are
> also a few lines of code in libc.)

Not tried chronyd yet?

> I think it's reasonable for ntpd to expect the kernel clock to be
> reasonably close.

I don't.

> We could have an interesting discussion about how
> close is reasonable, but most systems get well within 500.

We are not talking about most, we are talking about the outliers.

> If not,
> it's usually a bug in the setup/calibration chain someplace.

If chronyd can do it, ntpd should be able to do it.

> > So what is your 'adjtimex -p' output?
>          mode: 0
>        offset: -6106235

The adjtimex man page says that offset is beyond the range it will
accept from the command line:

    --offset adj
        adj must be in the range -512000...512000.

>     frequency: 565962
>      maxerror: 52530
>      esterror: 1820
>        status: 8193
> time_constant: 6
>     precision: 1
>     tolerance: 32768000
>          tick: 10029
>      raw time:  1485903664s 513432740us = 1485903664.513432740

This is from a RasPi 3:

> I'm not familiar with adjtimex.  I installed it a while ago.  The
> installation ran it.

Better get familiar with it.  It is just a command line shim to the
same syscall that ntpd uses to set your clock: ntp_adjtimex() or
adjtimex().

> Comparing clocks (this will take 70 sec)...done.
> Adjusting system time by 251.512 sec/day to agree with CMOS
> clock...done.
>
> That's 2911 ppm.
>
> Looks like ntpd is happy now.

And what was the magic incantation?  Can you reset and make ntpd fail
again?

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	g...@rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can’t measure it, you can’t improve it." - Lord Kelvin
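
As a sanity check on the numbers quoted in this thread, here is a minimal Python sketch that converts them to ppm. It assumes the usual Linux conventions: the adjtimex "frequency" field is in units of 2^-16 ppm, and "tick" is in microseconds with a nominal value of 10000 (USER_HZ = 100). The function names are made up for illustration and are not part of ntpd, chrony, or adjtimex.

```python
SECONDS_PER_DAY = 86_400


def sec_per_day_to_ppm(sec_per_day):
    """Convert a clock drift in seconds per day to parts per million."""
    return sec_per_day / SECONDS_PER_DAY * 1e6


def tick_to_ppm(tick_us, nominal_tick_us=10_000):
    """Rate offset implied by the kernel tick length.

    Assumes a nominal tick of 10000 us (USER_HZ = 100).
    """
    return (tick_us - nominal_tick_us) / nominal_tick_us * 1e6


def freq_to_ppm(freq_scaled):
    """Convert the adjtimex frequency field (2^-16 ppm units) to ppm."""
    return freq_scaled / 65536


# Values taken from the quoted 'adjtimex -p' output and CMOS comparison.
drift_ppm = sec_per_day_to_ppm(251.512)
tick_ppm = tick_to_ppm(10029)
freq_ppm = freq_to_ppm(565962)
total_ppm = tick_ppm + freq_ppm

print(f"CMOS comparison drift: {drift_ppm:.1f} ppm")
print(f"tick contribution:     {tick_ppm:.1f} ppm")
print(f"frequency contribution:{freq_ppm:.3f} ppm")
print(f"tick + frequency:      {total_ppm:.1f} ppm")
```

Under those assumptions, the 251.512 sec/day figure works out to roughly 2911 ppm, and almost all of the correction sits in the tick value rather than in the frequency field, which is well past the 500 ppm limit discussed above.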
_______________________________________________
devel mailing list
devel@ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/devel