On Thu, Oct 19, 2017 at 3:14 AM, Steffen Grunewald <steffen.grunew...@aei.mpg.de> wrote:
>> for some reason on an empty cluster when i spin up a large job it's
>> staggering the allocation across a seemingly random allocation of
>> nodes
>
> Have you looked into topology? With topology.conf, you may group nodes
> by (virtually or really, Slurm doesn't check nor care) connecting them
> to network switches... adding some "locality" to your cluster setup

yes, i have a topology file defined based on output from
ibslurmtopology.sh linked from the schedmd website

>> we're using backfill/cons_res + gres, and all the nodes are identical.
>
> Why do you care about the randomness then?

because I do. and further because slurm is skipping nodes for a reason
despite my topology file and i'd like to understand why
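
for reference, a minimal sketch of the kind of topology.conf being discussed. the switch and node names here are hypothetical and not taken from my cluster; the real file comes from the ibslurmtopology.sh output. it assumes the tree plugin is enabled in slurm.conf:

```
# slurm.conf (assumed setting, needed for topology.conf to be used)
TopologyPlugin=topology/tree

# topology.conf (sketch; switch and node names are hypothetical)
# leaf switches list the nodes attached to them
SwitchName=leaf1 Nodes=node[001-016]
SwitchName=leaf2 Nodes=node[017-032]
# the top-level switch lists the leaf switches below it
SwitchName=spine Switches=leaf[1-2]
```

running `scontrol show topology` should show the switch layout slurm actually loaded, which can be compared against the file.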