I’m using GRES to manage eight GPUs in a node on a new Slurm cluster and am trying to bind specific CPUs to specific GPUs, but it’s not working as I expected.
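To make my expectation concrete (and it may simply be wrong): I read Cores=10-11 for /dev/nvidia0 in gres.conf to mean that a job allocated that GPU should be confined to those cores, so I expected a one-GPU job's cpuset to look roughly like the sketch below. The 10,50 pairing is only my guess at core 10 plus its hyperthread sibling, based on the 5,45 pairing that actually shows up in the output further down -- I have not seen this output on the cluster.

    $ srun -p dgx1 --gres=gpu:1 --pty $SHELL
    [node-01:~]$ cat /sys/fs/cgroup/cpuset/slurm/uid_*/job_*/cpuset.cpus
    10,50    <-- expected: CPUs from the Cores= range of nvidia0, not 5,45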
I am able to request a specific number of GPUs, but the CPU assignment seems wrong. I assume I'm missing something obvious but just can't find it. Any suggestion for how to fix it, or how to better investigate the problem, would be much appreciated.

Example srun requesting one GPU:

    $ srun -p dgx1 --gres=gpu:1 --pty $SHELL
    [node-01:~]$ nvidia-smi --query-gpu=index,name --format=csv
    index, name
    0, Tesla V100-SXM2-16GB
    [node-01:~]$ cat /sys/fs/cgroup/cpuset/slurm/uid_*/job_*/cpuset.cpus
    5,45

A similar example requesting eight GPUs:

    $ srun -p dgx1 --gres=gpu:8 --pty $SHELL
    [node-01:~]$ nvidia-smi --query-gpu=index,name --format=csv
    index, name
    0, Tesla V100-SXM2-16GB
    1, Tesla V100-SXM2-16GB
    2, Tesla V100-SXM2-16GB
    3, Tesla V100-SXM2-16GB
    4, Tesla V100-SXM2-16GB
    5, Tesla V100-SXM2-16GB
    6, Tesla V100-SXM2-16GB
    7, Tesla V100-SXM2-16GB
    [node-01:~]$ cat /sys/fs/cgroup/cpuset/slurm/uid_*/job_*/cpuset.cpus
    5,45

The machines all run Ubuntu 16.04 and the Slurm version is 17.11.9-2.

The /etc/slurm/gres.conf file:

    [node-01:~]$ less /etc/slurm/gres.conf
    Name=gpu Type=V100 File=/dev/nvidia0 Cores=10-11
    Name=gpu Type=V100 File=/dev/nvidia1 Cores=12-13
    Name=gpu Type=V100 File=/dev/nvidia2 Cores=14-15
    Name=gpu Type=V100 File=/dev/nvidia3 Cores=16-17
    Name=gpu Type=V100 File=/dev/nvidia4 Cores=18-19
    Name=gpu Type=V100 File=/dev/nvidia5 Cores=20-21
    Name=gpu Type=V100 File=/dev/nvidia6 Cores=22-23
    Name=gpu Type=V100 File=/dev/nvidia7 Cores=24-25

The /etc/slurm/slurm.conf file on all machines in the cluster (with minor cleanup):

    ClusterName=testcluster
    ControlMachine=slurm-master
    SlurmUser=slurm
    SlurmctldPort=6817
    SlurmdPort=6818
    AuthType=auth/munge
    SlurmdSpoolDir=/var/spool/slurm/d
    SwitchType=switch/none
    MpiDefault=none
    SlurmctldPidFile=/var/run/slurmctld.pid
    SlurmdPidFile=/var/run/slurmd.pid
    ProctrackType=proctrack/cgroup
    PluginDir=/usr/lib/slurm
    ReturnToService=2
    Prolog=/etc/slurm/slurm.prolog
    PrologSlurmctld=/etc/slurm/slurm.ctld.prolog
    Epilog=/etc/slurm/slurm.epilog
    EpilogSlurmctld=/etc/slurm/slurm.ctld.epilog
    TaskProlog=/etc/slurm/slurm.task.prolog
    TaskPlugin=task/affinity,task/cgroup
    SlurmctldTimeout=300
    SlurmdTimeout=300
    InactiveLimit=0
    MinJobAge=300
    KillWait=20
    Waittime=0
    SchedulerType=sched/backfill
    SelectType=select/cons_res
    SelectTypeParameters=CR_Core_Memory
    FastSchedule=0
    DebugFlags=CPU_Bind,gres
    SlurmctldDebug=debug5
    SlurmctldLogFile=/var/log/slurm/slurmctld.log
    SlurmdDebug=3
    SlurmdLogFile=/var/log/slurm/slurmd.log
    JobCompType=jobcomp/filetxt
    JobCompLoc=/data/slurm/job_completions.log
    AccountingStorageType=accounting_storage/slurmdbd
    AccountingStorageLoc=/data/slurm/accounting_storage.log
    AccountingStorageEnforce=associations,limits,qos
    AccountingStorageTRES=gres/gpu,gres/gpu:V100
    PreemptMode=SUSPEND,GANG
    PrologFlags=Serial,Alloc
    RebootProgram="/sbin/shutdown -r 3"
    PreemptType=preempt/partition_prio
    CacheGroups=0
    DefMemPerCPU=2048
    GresTypes=gpu
    NodeName=node-01 State=UNKNOWN \
        Sockets=2 CoresPerSocket=20 ThreadsPerCore=2 \
        Gres=gpu:V100:8
    PartitionName=all Nodes=node-01 \
        Default=YES MaxTime=4:0:0 DefaultTime=4:0:0 State=UP

Thanks,
Randy
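P.S. In case it helps, below is the kind of cross-check I can run to compare what slurmd, the controller, and the hardware each think the core/GPU layout is. These are just stock Slurm and NVIDIA commands (plus a grep of the slurmd log, since DebugFlags=CPU_Bind,gres is already set above); I have not included their output here.

    # Physical layout (sockets/cores/threads) as detected by slurmd
    [node-01:~]$ slurmd -C

    # Node and GRES state as seen by the controller
    $ scontrol show node node-01

    # CPU affinity per GPU as reported by the NVIDIA driver
    [node-01:~]$ nvidia-smi topo -m

    # GRES / CPU-bind debug lines from the node daemon
    [node-01:~]$ grep -i gres /var/log/slurm/slurmd.log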