[slurm-users] Re: error: Unable to contact slurm controller (connect failure)

2024-11-19 Thread daniel.rodriguez--- via slurm-users
Hi, Thank you all for the quick answers. We tried your suggestions, and the problem turned out to be in slurm.conf: we had not noticed that the name of the control server had a typo. Thank you, I really appreciate the help. Best, Daniel
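
For anyone who lands on this thread with the same error, here is a minimal sketch of how a typo like this might be caught. It reads the controller name from slurm.conf and checks that it resolves. It assumes the controller is set with `SlurmctldHost` (or the older `ControlMachine`). The helper names and the path in the usage note are illustrative and do not come from this thread.

```python
import re
import socket

# Matches e.g. "SlurmctldHost=ctl1" or "SlurmctldHost=ctl1(10.0.0.1)".
# ControlMachine is the older name for the same setting.
# Commented-out lines are skipped because '#' precedes the keyword.
_HOST_RE = re.compile(
    r"^\s*(?:SlurmctldHost|ControlMachine)\s*=\s*([^\s(#]+)",
    re.IGNORECASE,
)


def controller_hosts(conf_text):
    """Return the controller host names found in slurm.conf text."""
    hosts = []
    for line in conf_text.splitlines():
        match = _HOST_RE.match(line)
        if match:
            hosts.append(match.group(1))
    return hosts


def resolves(host):
    """Return True if the host name resolves to at least one address."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


def check_conf(path):
    """Print each controller host in the file and whether it resolves."""
    with open(path) as fh:
        for host in controller_hosts(fh.read()):
            status = "resolves" if resolves(host) else "DOES NOT RESOLVE"
            print(f"{host}: {status}")
```

Calling `check_conf("/etc/slurm/slurm.conf")` on a client node (the path is an assumption and varies by installation) would flag a misspelled controller name, provided the misspelling does not happen to resolve to some other host.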

[slurm-users] Re: error: Unable to contact slurm controller (connect failure)

2024-11-18 Thread Steffen Grunewald via slurm-users
Hi Daniel,
> error: Unable to contact slurm controller (connect failure)
>
> I appreciate any insight on what could be the cause.
Can you check that the slurmctld is up and running, and that the said commands work on the controller machine itself? If the slurmctld cannot be started as a service
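
As a rough complement to the check suggested above, the sketch below tests from a client node whether anything accepts TCP connections on the controller's port. The function name is hypothetical. The port 6817 is Slurm's default `SlurmctldPort`, so a site that overrides it in slurm.conf would need to pass its own value.

```python
import socket


def controller_reachable(host, port=6817, timeout=3.0):
    """Try a plain TCP connection to the slurmctld port.

    A successful connect only shows that something is listening on
    that port; it does not prove that slurmctld itself is healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

If this returns False while `scontrol ping` succeeds on the controller machine itself, the likely suspects are the network path, a firewall, or the controller name configured on the client.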

[slurm-users] Re: error: Unable to contact slurm controller (connect failure)

2024-11-18 Thread Sid Young via slurm-users
A few things to look at: make sure DNS/hostname resolution works; disable any firewalls for testing (you can lock things down again afterwards); and make sure the slurm.conf file is the same on all nodes. I've just done a 20.11.9 to 24.05.2 upgrade along with a CentOS 7.9 to RHEL 9.10 upgrade on all my nodes. Sid
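
The "same slurm.conf on all nodes" check can be sketched as a digest comparison. This is only an illustration: the function name is hypothetical, and how the file contents are collected from each node (scp, a parallel shell, configuration management) is left to the site.

```python
import hashlib
from collections import defaultdict


def group_by_digest(configs):
    """Group node names by the SHA-256 digest of their slurm.conf bytes.

    `configs` maps a node name to the raw contents of that node's
    slurm.conf. More than one group in the result means the files differ.
    """
    groups = defaultdict(list)
    for node, data in configs.items():
        groups[hashlib.sha256(data).hexdigest()].append(node)
    return dict(groups)
```

A result with a single key means every node has byte-identical configuration. Any additional key points at the nodes whose copy has drifted.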