Thank you all for the early answers. We tried your suggestions, and the
problem turned out to be in slurm.conf: the name of the control server had a
typo that we had not noticed.
Thank you, I really appreciate the help.
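
For anyone who hits the same error, one quick way to catch a misspelled
controller name is to pull the host out of slurm.conf and see whether it
resolves. This is only a rough sketch in Python: the helper names
controller_hosts and unresolvable are made up for illustration, and it
assumes the controller is set with SlurmctldHost (or the older
ControlMachine) as a plain key=value line.

```python
import re
import socket


def controller_hosts(conf_text):
    """Return the controller host names found in slurm.conf text."""
    hosts = []
    for line in conf_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Match e.g. "SlurmctldHost=ctl1" or "SlurmctldHost=ctl1(10.0.0.1)";
        # the optional "(address)" part is left out of the captured name.
        m = re.match(r"(?i)(SlurmctldHost|ControlMachine)\s*=\s*([^\s(]+)", line)
        if m:
            hosts.append(m.group(2))
    return hosts


def unresolvable(hosts):
    """Return the subset of host names that fail name resolution here."""
    bad = []
    for host in hosts:
        try:
            socket.gethostbyname(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            bad.append(host)
    return bad
```

Calling unresolvable(controller_hosts(text)) on the contents of the
slurm.conf of a given node lists the controller names that do not resolve
from that node. A typo that happens to match another valid host name would
not be caught this way.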
slurm-users mailing list -- slurm-users@lists.schedmd.com
Hi Daniel,
> error: Unable to contact slurm controller (connect failure)
> I appreciate any insight on what could be the cause.
Can you check that slurmctld is up and running, and that those commands
work on the controller machine itself?
If the slurmctld cannot be started as a service
A few things to look at: make sure DNS/hostname resolution works; disable
any firewalls for testing (you can lock them down again afterwards); and
make sure the slurm.conf file is the same on all nodes.
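
Two of those checks are easy to script. The sketch below is in Python and is
not an official Slurm tool: can_connect and conf_digest are hypothetical
helper names, and 6817 is used as the port on the assumption that
SlurmctldPort has been left at its default. The first function tells you
whether a TCP connection to the controller gets through (a firewall or a
stopped slurmctld will both make it fail), the second gives a checksum you
can compare between nodes.

```python
import hashlib
import socket


def can_connect(host, port=6817, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def conf_digest(path):
    """Return the SHA-256 hex digest of the file at path."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()
```

Run can_connect from a compute or login node with your controller's name,
and run conf_digest on each node's slurm.conf; if the digests differ, the
files are not identical.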
I've just done a 20.11.9 to 24.05.2 upgrade along with a CentOS 7.9 to RHEL
9.10 upgrade on all my nodes.