The deeper I dig into the select/cons_res plugin, the messier it appears to be: inconsistencies with the documentation, etc.
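For reference, the test.sh used in the examples below is trivial. A minimal sketch, assuming all it has to do is dump the SLURM_* environment into the job's output file:

    #!/bin/bash
    # Print every SLURM_* variable set in the batch environment, so the
    # slurm-<jobid>.out files can be grepped for SLURM_TASKS_PER_NODE.
    env | grep '^SLURM_' | sort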
The primary issue seems to be with how select/cons_res chooses nodes when "--ntasks-per-node" et al. are absent. By default, the algorithm selects "--nodes=N" nodes, then packs the "--ntasks=n" tasks onto them starting with the first node selected, ensuring that at least one task lands on every node. The packing is naturally influenced by how many cores are unused on each of the selected nodes.

This renders the "--distribution=plane=X" option useless. If I ask for:

    --nodes=2 --ntasks=8 --distribution=plane=3

the resulting allocation is:

    SLURM_NNODES=2
    SLURM_DIST_PLANESIZE=3
    SLURM_NTASKS=8
    SLURM_TASKS_PER_NODE=7,1

which isn't remotely what the "plane" option claims to do. "cyclic" and "block" yield exactly the same behavior, so under the default algorithm all distribution choices (in terms of node and core selection) are indistinguishable:

    [frey@login ~]$ sbatch --nodes=2 --ntasks=8 --distribution=cyclic test.sh
    Submitted batch job 558
    [frey@login ~]$ sbatch --nodes=2 --ntasks=8 --distribution=block test.sh
    Submitted batch job 559
    [frey@login ~]$ sbatch --nodes=2 --ntasks=8 --distribution=plane=3 test.sh
    Submitted batch job 560
    [frey@login ~]$ grep SLURM_TASKS_PER_NODE slurm-5*
    slurm-558.out:SLURM_TASKS_PER_NODE=7,1
    slurm-559.out:SLURM_TASKS_PER_NODE=7,1
    slurm-560.out:SLURM_TASKS_PER_NODE=7,1

Poking through the source code, though, I found that the "SPREAD_JOB" option triggers an alternate algorithm more in line with your expectations and mine. The sbatch man page isn't 100% clear about what "--spread-job" will do (it sounds as though it will spread the job across the whole partition of nodes), but it turns out to honor the "--nodes=N" that was specified. Submitting with:

    --nodes=2 --ntasks=8 --distribution=plane=3 --spread-job

yields the asymmetric task distribution that "plane=3" _should_ create for this task and node count:

    SLURM_NNODES=2
    SLURM_JOBID=557
    SLURM_DIST_PLANESIZE=3
    SLURM_NTASKS=8
    SLURM_TASKS_PER_NODE=5,3

Likewise, adding "--spread-job" to the "cyclic" and "block" distribution options yields SLURM_TASKS_PER_NODE=4(x2).


> On Oct 17, 2017, at 02:49 , sysadmin.caos <sysadmin.c...@uab.cat> wrote:
> 
> If I run with "--ntasks-per-node=6", the result is:
> 
> Process 0 on clus01.hpc.local out of 12
> Process 1 on clus02.hpc.local out of 12
> Process 2 on clus01.hpc.local out of 12
> Process 3 on clus02.hpc.local out of 12
> Process 4 on clus01.hpc.local out of 12
> Process 5 on clus02.hpc.local out of 12
> Process 6 on clus01.hpc.local out of 12
> Process 7 on clus02.hpc.local out of 12
> Process 8 on clus01.hpc.local out of 12
> Process 9 on clus02.hpc.local out of 12
> Process 10 on clus01.hpc.local out of 12
> Process 11 on clus02.hpc.local out of 12
> 
> so it's correct... but suppose you don't know how many cores each node
> has; maybe the cluster nodes have 24 cores. Must you then explicitly
> divide the number of tasks by the number of nodes inside your script to
> assign the correct value to "--ntasks-per-node" on the "srun" command?
> Isn't there an automatic way to allocate in a cyclic distribution?
> 
> Thanks.
> 
> On 16/10/2017 at 16:11, Jeffrey T Frey wrote:
>>> If, now, I submit with "sbatch --distribution=cyclic -N 2 -n 12
>>> ./test-new.sh", what I get is:
>>> 
>>> Process 0 on clus01.hpc.local out of 12
>>> Process 1 on clus02.hpc.local out of 12
>>> Process 2 on clus01.hpc.local out of 12
>>> Process 3 on clus01.hpc.local out of 12
>>> Process 4 on clus01.hpc.local out of 12
>>> Process 5 on clus01.hpc.local out of 12
>>> Process 6 on clus01.hpc.local out of 12
>>> Process 7 on clus01.hpc.local out of 12
>>> Process 8 on clus01.hpc.local out of 12
>>> Process 9 on clus01.hpc.local out of 12
>>> Process 10 on clus01.hpc.local out of 12
>>> Process 11 on clus01.hpc.local out of 12
>>> 
>>> ...but I was expecting another result... Something like this:
>>> 
>>> Process 0 on clus01.hpc.local out of 12
>>> Process 1 on clus02.hpc.local out of 12
>>> Process 2 on clus01.hpc.local out of 12
>>> Process 3 on clus02.hpc.local out of 12
>>> Process 4 on clus01.hpc.local out of 12
>>> Process 5 on clus02.hpc.local out of 12
>>> Process 6 on clus01.hpc.local out of 12
>>> Process 7 on clus02.hpc.local out of 12
>>> Process 8 on clus01.hpc.local out of 12
>>> Process 9 on clus02.hpc.local out of 12
>>> Process 10 on clus01.hpc.local out of 12
>>> Process 11 on clus02.hpc.local out of 12
>>> 
>>> because I'm forcing a cyclic distribution. Where is the problem?
>>>
>>
>> Have you tried this same thing but with tasks-per-node specified -- be as
>> explicit as possible about how many tasks you want placed on each node? E.g.
>>
>>     sbatch -N 2 -n 12 --ntasks-per-node=6 --distribution=cyclic ./test-new.sh
>>
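As for the question quoted above about an automatic way to split the tasks between nodes: one approach is to let the batch script derive the per-node count from the variables Slurm sets in the job environment. A sketch, untested, with "./mpi_program" standing in for the actual MPI binary and assuming SLURM_NTASKS divides evenly by SLURM_NNODES:

    #!/bin/bash
    #SBATCH --nodes=2
    #SBATCH --ntasks=12
    # Derive tasks-per-node from the allocation rather than hard-coding it.
    srun --ntasks-per-node=$(( SLURM_NTASKS / SLURM_NNODES )) \
         --distribution=cyclic ./mpi_program

With --nodes=2 --ntasks=12 this passes --ntasks-per-node=6 to srun, matching the explicit invocation suggested above without the script needing to know the node's core count.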