I have logs going back to 2019-10-26. These clock fuzz errors started on ntp1 on 2019-11-02 and ntp2 on 2019-11-21.
On 2019-11-02, I upgraded to NTPsec 1.1.7 (from 1.1.3) and enabled NTS
(as both a client and server).

On 2019-11-08, I added the GPS to ntp2. Based on the dates, that seems
unrelated.

On 2019-11-21 on ntp2, I was performing debugging as discussed earlier
in the thread. This involved a reboot. This is probably when it moved to
Linux 4.15.0-70-generic (that's from the Ubuntu package), likely from
4.15.0-45-generic. That also seems unrelated, though, as ntp1 is still
running 4.15.0-45-generic and has not been rebooted since 2019-09-28.

Trying again with NTPsec 1.1.3 seems like a useful next step. If that is
good, then I need to bisect the difference.

On 11/25/19 11:46 AM, Achim Gratz via devel wrote:
> Richard Laager via devel writes:
>> These both have the following CPU (which is older):
>> Intel(R) Xeon(R) CPU X5460 @ 3.16GHz
>
> These may not yet have consistent TSC between cores/sockets (or require
> BIOS tweaks for that).

/proc/cpuinfo says constant_tsc, but that's it (besides "tsc", of
course). That is, I do _not_ have nonstop_tsc, so I presume I do not
have the "invariant TSC" CPU feature.

Any thoughts on what to look for in the BIOS? I poked around, but there
didn't seem to be much related. There was a "Clock Spectrum Feature",
which I assume is something about spread spectrum, which is disabled.
The HPET is enabled. The Intel EIST setting is set to disabled, which
the help text says disables C-states.

Should I consider trying the HPET as the kernel clocksource?

$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc

I only have the clock fuzzing errors on my NTP servers. I don't have an
exact matching configuration that's not an NTP server, but:

Similar hardware running Debian 10 and ntpsec 1.1.3 does not.
Two eras of newer hardware running Ubuntu 18.04 and the same ntpsec do
not.

I tried disabling the second CPU socket.
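For anyone wanting to repeat the TSC flag and clocksource checks described above on another machine, here is a minimal sketch. The helper names (tsc_flags, read_text) are made up for illustration and are not part of NTPsec; it only assumes the usual Linux /proc/cpuinfo and sysfs clocksource paths quoted above.

```python
# Sketch: report TSC-related CPU flags and the kernel clocksource settings.
# Helper names are hypothetical; paths are the Linux ones quoted in the mail.
TSC_FLAGS = ("tsc", "constant_tsc", "nonstop_tsc")
CLOCKSOURCE_DIR = "/sys/devices/system/clocksource/clocksource0/"


def tsc_flags(cpuinfo_text):
    """Return {flag: bool} from the first 'flags' line of cpuinfo_text."""
    for line in cpuinfo_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "flags":
            present = set(value.split())
            return {flag: flag in present for flag in TSC_FLAGS}
    # No 'flags' line found (non-x86 system or empty input).
    return {flag: False for flag in TSC_FLAGS}


def read_text(path):
    """Read a small text file such as a /proc or sysfs entry."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    try:
        print(tsc_flags(read_text("/proc/cpuinfo")))
        print("current:  ", read_text(CLOCKSOURCE_DIR + "current_clocksource").strip())
        print("available:", read_text(CLOCKSOURCE_DIR + "available_clocksource").split())
    except OSError as exc:
        # Not Linux, or the files are hidden inside a container.
        print("could not read clock information:", exc)
```

On Linux, writing one of the names listed in available_clocksource into current_clocksource as root switches the clocksource at runtime, so an HPET trial should not need a reboot; whether that helps with the fuzz errors here is untested.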
I verified that I'm down from 8 cores to 4 cores. No change to the
CLOCK_MONOTONIC_RAW performance:

rlaager@ntp2:~$ ./a.out
      res   avg      min  dups  CLOCK
        1    28       26        CLOCK_REALTIME
  4000000     8  3999658    -1  CLOCK_REALTIME_COARSE
        1    28       26        CLOCK_MONOTONIC
        1   374      362        CLOCK_MONOTONIC_RAW
        1   383      371        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
        ns      hits
        26     50500
        27    531901
        28     49137
        29       914
        30    367497
        33         2
        34         1
        36         5
        39         1
        60         1
41 samples were bigger than 60.

rlaager@ntp2:~$ ./a.out
      res   avg      min  dups  CLOCK
        1    29       26        CLOCK_REALTIME
  4000000     8  3999852    -3  CLOCK_REALTIME_COARSE
        1    28       26        CLOCK_MONOTONIC
        1   375      362        CLOCK_MONOTONIC_RAW
        1   384      372        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
        ns      hits
        26     50139
        27    531774
        28     49516
        29       397
        30    367973
        32         1
        36         1
        48         1
        63         2
        66         2
194 samples were bigger than 66.

rlaager@ntp2:~$ ./a.out
      res   avg      min  dups  CLOCK
        1    28       26        CLOCK_REALTIME
  4000000     8  3999859    -3  CLOCK_REALTIME_COARSE
        1    28       26        CLOCK_MONOTONIC
        1   374      366        CLOCK_MONOTONIC_RAW
        1   385      374        CLOCK_BOOTTIME

Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
        ns      hits
        26     49328
        27    523261
        28     48763
        29       360
        30    378133
        33         2
        36         4
        39         2
        40         1
        45         2
144 samples were bigger than 45.

--
Richard
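The runs above time back-to-back clock reads and histogram the differences. The source of a.out is not shown in this message, so the following is only a rough sketch of the same idea, not that program: the function names and the sample count are assumptions, and Python's call overhead is far larger than the ~26 ns seen above, so only the relative cost of the clocks is comparable.

```python
# Sketch: histogram the differences between consecutive reads of a clock.
# This is NOT the a.out used above; names and sample count are assumptions.
import time
from collections import Counter


def delta_histogram(clock_id, samples=100_000):
    """Count how often each delta (ns) occurs between back-to-back reads."""
    hist = Counter()
    prev = time.clock_gettime_ns(clock_id)
    for _ in range(samples):
        now = time.clock_gettime_ns(clock_id)
        hist[now - prev] += 1
        prev = now
    return hist


def summarize(hist):
    """Return (average delta, smallest positive delta or None, zero-delta count)."""
    total = sum(hist.values())
    avg = sum(delta * hits for delta, hits in hist.items()) // total
    positive = [delta for delta in hist if delta > 0]
    return avg, (min(positive) if positive else None), hist.get(0, 0)


if __name__ == "__main__":
    for name in ("CLOCK_REALTIME", "CLOCK_MONOTONIC", "CLOCK_MONOTONIC_RAW"):
        clock_id = getattr(time, name, None)  # not every platform has all clocks
        if clock_id is None:
            continue
        res_ns = round(time.clock_getres(clock_id) * 1e9)
        avg, smallest, zeros = summarize(delta_histogram(clock_id))
        print(f"{res_ns:9d} {avg:6d} {smallest!s:>8} {zeros:6d}  {name}")
```

A clock that is an order of magnitude slower than the others in this output, as CLOCK_MONOTONIC_RAW is above, would suggest that its reads take a slower path than the others.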