That is because your configuration lists only node0 as the controller
(SlurmctldHost). Only one slurmctld can be active at a time, so you can
either define node1 as a backup controller or simply not start slurmctld
on it; a compute node runs slurmd, not slurmctld.
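
As a rough sketch, assuming the hostnames node0 and node1 from your
message, the controller lines in slurm.conf would look something like
the fragment below. The commented-out second line is only needed if you
actually want a backup controller; the rest of your file stays as it is.

```
# slurm.conf (same file on both nodes)
SlurmctldHost=node0     # primary controller: the only host that runs slurmctld
#SlurmctldHost=node1    # optional: a second SlurmctldHost entry acts as backup controller
```

If you do enable the backup, both controllers need access to the same
StateSaveLocation (a shared filesystem, for example). Otherwise, start
only slurmd on node1 and leave slurmctld stopped there.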
Brian Andrus
On 6/28/2019 6:31 AM, Pär Lundö wrote:
Hi all slurm-experts!
Recently I managed to configure and install Slurm version 19.05
on Ubuntu 18.04 and Ubuntu 18.10.
I got it to run on my single-node computer (a notebook).
Feeling fairly comfortable with this setup, I tried to extend it
to an additional computer, say node1, in my network. I now have two
nodes, node0 and node1. Node0 being the "SlurmctldHost" and node1
being part of a partition. The two nodes have identical copies of
slurm.conf.
However, when starting "slurmctld" and "slurmd" on node1, I receive
errors stating that this host (node1) is not a valid controller.
Both node0 and node1 have copies of the same /etc/hosts file.
I can ping node1 from node0 and node0 from node1.
Both nodes have the same munge.key; I checked it with the cksum command.
When I start slurmctld manually with the arguments "-D -vvvvv",
I receive the same errors as reported by the "systemctl status
slurmctld" command.
I also receive an error stating that my MailProg is faulty; however,
the "MailProg" in slurm.conf is commented out, and I have no intention
of using one.
I have searched the documentation and previously posted questions about
this, but have not found a solution.
Any help is much appreciated, thank you!
Best regards,
Palle