Dear Slurm community,
I am encountering a situation where I need to allocate jobs across nodes with different numbers of CPU cores. For instance:

    node01: Xeon 6226, 32 cores
    node02: EPYC 7543, 64 cores

    $ salloc --partition=all --nodes=2 --nodelist=gpu01,gpu02 --ntasks-per-node=32 --comment=etc

If --ntasks-per-node is larger than 32, the job cannot be allocated, since node01 has only 32 cores.

In the context of NVIDIA's HPL container, we need to pin MPI processes according to NUMA affinity for best performance. On HGX-1, the eight A100s have affinity with the 1st, 3rd, 5th, and 7th NUMA domains (two GPUs per domain). With --ntasks-per-node=32, only the first half of the EPYC's NUMA domains is available, so we had to assign the 4th-7th A100s to the 0th and 2nd NUMA domains, leading to some performance degradation.

I am looking for a way to request more tasks than the number of physically available cores, i.e.

    $ salloc --partition=all --nodes=2 --nodelist=gpu01,gpu02 --ntasks-per-node=64 --comment=etc

Your suggestions are much appreciated.

Regards,
Viet-Duc
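P.S. To make the constraint concrete, here is a minimal sketch of the mismatch (the GPU-to-NUMA map below is assumed from our HGX box; in practice it would come from `nvidia-smi topo -m`):

```shell
#!/bin/sh
# Hypothetical illustration: on the 64-core EPYC node with 8 NUMA domains,
# --ntasks-per-node=32 fills only the first 4 domains with ranks, while the
# A100s prefer domains 1, 3, 5, 7 (two GPUs per domain, assumed mapping).
NUMA_DOMAINS=8
CORES_PER_DOMAIN=8          # 64 cores / 8 domains
NTASKS=32
# NUMA domains actually covered by 32 contiguously bound ranks:
COVERED=$((NTASKS / CORES_PER_DOMAIN))   # domains 0..COVERED-1
echo "domains with ranks: 0..$((COVERED - 1))"
for GPU in 0 1 2 3 4 5 6 7; do
  WANT=$(( (GPU / 2) * 2 + 1 ))          # assumed GPU->NUMA map: 1,1,3,3,5,5,7,7
  if [ "$WANT" -lt "$COVERED" ]; then OK=yes; else OK=no; fi
  echo "GPU $GPU prefers NUMA $WANT, rank available there: $OK"
done
```

With 32 tasks only GPUs 0-3 can be served from their preferred domains; GPUs 4-7 end up pinned elsewhere, which is the degradation described above.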