Hello all!
We are using Slurm 20.11.8 with
SelectType = select/cons_tres
SelectTypeParameters = CR_CORE_MEMORY
and nodes with hyperthreading enabled, e.g.
NodeName=slurm-node?? NodeAddr=192.?? Procs=72 Sockets=2 CoresPerSocket=18
ThreadsPerCore=2 RealMemory=...
When we launch jobs on these nodes with --cpus-per-task 1, the command
executes twice:
> $ srun --cpus-per-task 1 echo foo
> foo
> foo
Digging deeper, I found
$ srun --cpus-per-task 1 env | grep -i tasks
SLURM_NTASKS=2
SLURM_TASKS_PER_NODE=2
SLURM_STEP_NUM_TASKS=2
SLURM_STEP_TASKS_PER_NODE=2
SLURM_NTASKS=2
SLURM_TASKS_PER_NODE=2
SLURM_STEP_NUM_TASKS=2
SLURM_STEP_TASKS_PER_NODE=2
whereas `scontrol show job 12345 | grep -i -e numtasks -e numcpus` for
both the "env" and the "echo" job gives
NumNodes=1 NumCPUs=2 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
A test node without ThreadsPerCore=2 behaves "normally".
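For anyone trying to reproduce this, the number of tasks that actually ran can be counted from the env output instead of being read off by eye. This is only a minimal sketch: `count_step_tasks` is a hypothetical helper name of mine (not a Slurm command), and the sample input is just the output captured above, replayed so the snippet runs without a cluster.

```shell
#!/bin/sh
# Hypothetical helper (not part of Slurm): count how many task instances
# appear in `env` output captured from an srun step.
count_step_tasks() {
    # Each task prints exactly one SLURM_STEP_NUM_TASKS line, so the number
    # of matching lines equals the number of tasks that ran.
    # `|| true` keeps the exit status at 0 when grep finds no match.
    grep -c '^SLURM_STEP_NUM_TASKS=' || true
}

# On a cluster this would be fed directly from srun, e.g.:
#   srun --cpus-per-task 1 env | count_step_tasks
# Here the output captured above is replayed instead:
printf '%s\n' \
    'SLURM_NTASKS=2' 'SLURM_TASKS_PER_NODE=2' \
    'SLURM_STEP_NUM_TASKS=2' 'SLURM_STEP_TASKS_PER_NODE=2' \
    'SLURM_NTASKS=2' 'SLURM_TASKS_PER_NODE=2' \
    'SLURM_STEP_NUM_TASKS=2' 'SLURM_STEP_TASKS_PER_NODE=2' \
    | count_step_tasks
```

On the affected nodes I would expect this to report 2 for the plain --cpus-per-task 1 call and 1 once -n1 is added.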
Also, setting the number of tasks explicitly with -n1
> $ srun -n1 --cpus-per-task 1 echo foo
> foo
resolves the problem.
This seems like a bug to me.
Can this be reproduced (on newer versions)?
Can this be avoided by setting a default number of tasks or some
other (partition) parameter? Sorry for asking, but I couldn't find
anything in the documentation.
Let me know if you need more information.
Best Regards, Benjamin