Listing the hosts separately does what I want. There was nothing wrong with the individual hostnames. The comma-separated list syntax just doesn't work.
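
For anyone finding this later, here is a sketch of the layout that worked, using the placeholder hostnames host0 and host1 from my original post (substitute your own). As I read the slurm.conf documentation, the first SlurmctldHost entry is the primary controller and any later entries are backups:

```
# slurm.conf (fragment); host0 and host1 are placeholder names
# One SlurmctldHost line per controller instead of a comma-separated list.
SlurmctldHost=host0
SlurmctldHost=host1
```

With the comma-separated form, the error output suggests the whole string "host0,host1" was treated as a single hostname, which is why resolution failed.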

-- Gary

On 12/14/23, 7:14 AM, "slurm-users on behalf of Jens Elkner" <slurm-users-boun...@lists.schedmd.com on behalf of jel+sl...@cs.ovgu.de> wrote:

On Wed, Dec 13, 2023 at 08:16:39PM +0000, Jackson, Gary L. wrote:

Hi Gary,

> The SlurmctldHost value is set like the following in my slurm.conf:
>
> SlurmctldHost=host0,host1
>
> That seems to be legal according to the documentation. However, I get error
> messages like the following:
>
> $ srun id
>
> srun: error: get_addr_info: getaddrinfo() failed: Name or service not known
> srun: error: slurm_set_addr: Unable to resolve "host0,host1"
> srun: error: Unable to establish control machine address
> srun: error: Unable to allocate resources: Address already in use
...
> What’s going on?

Not sure, but I've seen such errors when using a node name which was not
"registered" via NodeName or discovered otherwise. A look at the code at
that time revealed that the message is IMHO misleading: slurm does __not__
make a DNS lookup. It simply greps its internal list of known nodes and,
if the name is not found, emits such messages.

Other options: try using SlurmctldHost=... for each host on its own line
to rule out a format error. Not sure whether it supports ranges, too
(like SlurmctldHost=host[0-1]).

Last but not least, 'Address already in use': checking whether there is
already an instance or something else listening on the related port
shouldn't hurt ...

Have fun,
jel.
--
Otto-von-Guericke University     http://www.cs.uni-magdeburg.de/
Department of Computer Science   Geb. 29 R 027, Universitaetsplatz 2
39106 Magdeburg, Germany         Tel: +49 391 67 52768