On Tue, 2019-06-25 at 16:32 +0200, Valerio Bellizzomi wrote:
> On Tue, 2019-06-25 at 08:48 -0400, Eli V wrote:
> > My first guess would be that the host is not listed as one of the two
> > controllers in the slurm.conf. Also, keep in mind munge, and thus
> > slurm is very sensitive to lack of clock synchronization between
> > nodes. FYI, I run a hand built slurm 18.08.07 on debian 8 & 9 without
> > issues. Haven't tried 10 yet.
>
> I have discovered that Slurm is also sensitive to computer names.
> The controller was listed but with a dot and a domain name, I have
> removed the dot and domain name and resolved.
>
> Now I have another problem, the slurmd on the compute node refuses to
> connect to the controller with this error: Protocol authentication error
The exact error on the controller is "Invalid credentials". I have
copied the munge.key to both hosts, but the error persists.

> > On Tue, Jun 25, 2019 at 1:50 AM Valerio Bellizzomi <vale...@selnet.org>
> > wrote:
> > >
> > > I have installed slurmctld on Debian Testing, trying to start the daemon
> > > by hand:
> > >
> > > # /usr/sbin/slurmctld -D -v -f /etc/slurm-llnl/slurm.conf
> > > slurmctld: error: High latency for 1000 calls to gettimeofday(): 2072 microseconds
> > > slurmctld: pidfile not locked, assuming no running daemon
> > > slurmctld: slurmctld version 18.08.5-2 started on cluster selroc
> > > slurmctld: Munge cryptographic signature plugin loaded
> > > slurmctld: error: This host (master02/master02) not a valid controller
> > >
> > > Thanks
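
For anyone hitting the same "Invalid credentials" error, here is a rough sketch of how the key copy and the clock synchronization mentioned in this thread could be checked. The helper name key_digest, the default key path /etc/munge/munge.key, and the hostname node01 are assumptions for illustration, not taken from this setup. munged may also need a restart after the key file is replaced, since a running daemon may still be using the old key.

```shell
#!/bin/sh
# Sketch only: key_digest is a hypothetical helper, not part of Slurm or MUNGE.
# It prints the SHA-256 digest of a key file so the value can be compared
# between hosts without displaying the key itself.
# The default path below is an assumption; adjust it to your installation.
key_digest() {
    sha256sum "${1:-/etc/munge/munge.key}" | cut -d' ' -f1
}

# Suggested manual checks (left as comments because they need root and a
# second host; "node01" is a placeholder hostname):
#
#   key_digest                      # run on every host, digests must match
#   munge -n | unmunge              # local encode/decode round trip
#   munge -n | ssh node01 unmunge   # credential created here, decoded there
#   date -u; ssh node01 date -u     # clocks should agree closely
```

If the digests differ, the key copy did not take effect. If they match but the remote unmunge still fails, clock skew or a munged that was not restarted are the next things to look at.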