This ticket with SchedMD implies it's a munged issue:

https://bugs.schedmd.com/show_bug.cgi?id=1293

Is the munge daemon running on all systems? If it is, are all servers running a 
network time daemon such as chronyd or ntpd, and is the time in sync on all hosts?

Thanks Mick,

munge is seemingly running on all systems (systemctl status munge).  I do get a 
warning about the munge file changing on disk, but I'm pretty sure that's from 
warewulf sync'ing files every minute.  A sha256sum on the munge.key file on the 
compute nodes and host node says they're the same, so I think I can put that 
aside.
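
Matching checksums show the key is identical, but not that a credential 
created on one host actually decodes on another. The munge documentation 
suggests "munge -n | unmunge" locally and "munge -n | ssh <host> unmunge" 
remotely for that. A minimal sketch wrapping those two commands, assuming 
passwordless ssh from the management node; the function name and the "run" 
parameter are hypothetical, added only so the calls can be stubbed out:

```python
import subprocess


def munge_roundtrip(node=None, run=subprocess.run):
    """Return True if a credential encoded here decodes locally (node=None)
    or on `node` over ssh. `run` is injectable so it can be stubbed."""
    # Encode an empty-payload credential with the local munged.
    enc = run(["munge", "-n"], capture_output=True)
    if enc.returncode != 0:
        return False
    # Decode it locally, or on the remote node via ssh (assumed passwordless).
    cmd = ["unmunge"] if node is None else ["ssh", node, "unmunge"]
    dec = run(cmd, input=enc.stdout, capture_output=True)
    # unmunge exits non-zero when the credential cannot be validated.
    return dec.returncode == 0
```

Calling munge_roundtrip("sonic01") from the management node would exercise 
the same path Slurm depends on; a False result points at munge rather than 
at the key file.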

The management node runs chrony and the compute nodes sync to the management 
node.
[root@kirby uber]# chronyc tracking
Reference ID    : 4A06A849 (t2.time.gq1.yahoo.com)
Stratum         : 3
Ref time (UTC)  : Mon Jan 08 22:26:44 2024
System time     : 0.000032525 seconds slow of NTP time
Last offset     : -0.000021390 seconds
RMS offset      : 0.000055729 seconds
Frequency       : 38.797 ppm slow
Residual freq   : +0.001 ppm
Skew            : 0.018 ppm
Root delay      : 0.033342984 seconds
Root dispersion : 0.000524800 seconds
Update interval : 256.8 seconds
Leap status     : Normal

vs
[root@sonic01 ~]# chronyc tracking
Reference ID    : C0A80102 (warewulf)
Stratum         : 4
Ref time (UTC)  : Mon Jan 08 22:31:02 2024
System time     : 0.000000120 seconds slow of NTP time
Last offset     : -0.000000092 seconds
RMS offset      : 0.000014737 seconds
Frequency       : 47.495 ppm slow
Residual freq   : +0.000 ppm
Skew            : 0.066 ppm
Root delay      : 0.033458963 seconds
Root dispersion : 0.000283949 seconds
Update interval : 64.2 seconds
Leap status     : Normal

So, the compute node is talking to the host and the host is talking to generic 
NTP sources.  "date" shows the same time on the compute nodes.
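
To check this on more than two hosts without reading each block by eye, the 
"System time" line can be pulled out of the "chronyc tracking" output. A 
rough sketch, assuming the line format shown above; the function names and 
the 1-second tolerance are my own placeholders, not anything munge or Slurm 
defines. Note that each host reports its offset relative to its own time 
source, so this is only an indirect comparison between hosts.

```python
import re


def system_time_offset(tracking_text):
    """Return the signed offset in seconds from `chronyc tracking` output.
    Positive means the system clock is ahead of NTP time ("fast")."""
    m = re.search(
        r"^System time\s*:\s*([0-9.]+) seconds (slow|fast) of NTP time",
        tracking_text,
        re.MULTILINE,
    )
    if m is None:
        raise ValueError("no 'System time' line found")
    value = float(m.group(1))
    return value if m.group(2) == "fast" else -value


def within_tolerance(tracking_text, tolerance_s=1.0):
    """True if the reported offset magnitude is within tolerance_s
    (an arbitrary placeholder threshold)."""
    return abs(system_time_offset(tracking_text)) <= tolerance_s
```

Fed the two outputs above, both hosts report offsets in the microsecond 
range, far below anything that should upset credential validation.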
