On Thu, 2025-01-09 at 07:51:40 -0500, Slurm users wrote:
> Hello there and good morning from Baltimore.
>
> I have a small cluster with 100 nodes. When the cluster is completely empty
> of all jobs, the first job gets allocated to node 41. In other clusters,
> the first job gets allocated to node 01. If I specify node 01, the
> allocation works perfectly. I have my partition NodeName set as
> node[01-99], so having node41 used first is a surprise to me. We also have
> many other partitions which start with node41, but the partition being used
> for the allocation starts with node01.
>
> Does anyone know what would cause this?

Just a wild guess, but do you have a topology.conf file that somehow makes
this node look most reasonable to use for a single-node job?
(Topology attempts to assign, or hold back, sections of your network to
maximize interconnect bandwidth for multi-node jobs. Your node41 might be
one - or the first one of a series - that would leave bigger chunks unused
for bigger tasks.)

HTH,
 Steffen

-- 
Steffen Grunewald, Cluster Administrator
Max Planck Institute for Gravitational Physics (Albert Einstein Institute)
Am Mühlenberg 1 * D-14476 Potsdam-Golm * Germany
~~~
Fon: +49-331-567 7274
Mail: steffen.grunewald(at)aei.mpg.de
~~~
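
To illustrate the topology guess above, here is a toy Python sketch of a best-fit choice across leaf switches. It is not Slurm's actual selection code, and the switch names and node ranges (sw1, sw2, sw3) are made up for illustration, since the real topology.conf is not shown in this thread. It only shows how picking the smallest switch that satisfies a request could put a single-node job on node41 rather than node01.

```python
from typing import Dict, List, Optional

# Hypothetical leaf-switch layout (an assumption, not taken from the thread).
LEAF_SWITCHES: Dict[str, List[str]] = {
    "sw1": [f"node{i:02d}" for i in range(1, 41)],    # node01..node40
    "sw2": [f"node{i:02d}" for i in range(41, 61)],   # node41..node60
    "sw3": [f"node{i:02d}" for i in range(61, 100)],  # node61..node99
}


def pick_nodes(idle: Dict[str, List[str]], wanted: int) -> Optional[List[str]]:
    """Toy best fit: use the leaf switch with the fewest idle nodes
    that can still hold the whole request, keeping larger switches free."""
    candidates = [
        (len(nodes), name) for name, nodes in idle.items() if len(nodes) >= wanted
    ]
    if not candidates:
        return None  # no single leaf switch can hold the request
    _, best = min(candidates)
    return idle[best][:wanted]


if __name__ == "__main__":
    # On an empty cluster, a one-node job lands on the smallest switch.
    print(pick_nodes(LEAF_SWITCHES, 1))
```

With this made-up layout, the one-node request goes to sw2 because it is the smallest block that fits, which leaves the 40-node and 39-node blocks intact for larger jobs.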