Hi, we are running Slurm 17.02.7.
For some reason, the preferred CPUs defined in gres.conf for GPU devices are being ignored when we run jobs. In our gres.conf we have GPU resource lines such as:

    Name=gpu Type=kepler File=/dev/nvidia0 CPUs=0,1,2,3,4,5,6,7,16,17,18,19,20,21,22,23
    Name=gpu Type=kepler File=/dev/nvidia4 CPUs=8,9,10,11,12,13,14,15,24,25,26,27,28,29,30,31

But when we run a job that is allocated the second GPU, /sys/fs/cgroup/cpuset/slurm/..../cpuset.cpus reports that the job has been given CPUs from the first GPU's set. It looks as if the CPU/GPU affinity in gres.conf is being ignored completely. slurmd.log does not mention anything about it, even at maximum debug verbosity.

We have tried the following TaskPlugin settings: "task/affinity,task/cgroup" and just "task/cgroup". In both cases we also tried setting TaskPluginParam to "Cpuset". All of these configurations produced the same incorrect result.

Is there some special configuration needed to get CPU/GPU affinity through gres.conf to work as described in the documentation?

Thanks
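
A minimal sketch of the comparison described above, for anyone who wants to reproduce it. It assumes the CPUs= values from the two gres.conf lines quoted in this message; the helper names (parse_cpu_list, outside_preferred) are made up for illustration and are not part of Slurm. The text passed in is whatever the job's cpuset.cpus file contains.

```python
def parse_cpu_list(text):
    """Expand a cpuset-style list such as "0-7,16-23" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


# Preferred CPUs per GPU device, copied from the gres.conf lines in this message.
GRES_CPUS = {
    "/dev/nvidia0": parse_cpu_list("0,1,2,3,4,5,6,7,16,17,18,19,20,21,22,23"),
    "/dev/nvidia4": parse_cpu_list("8,9,10,11,12,13,14,15,24,25,26,27,28,29,30,31"),
}


def outside_preferred(cpuset_text, gpu_device):
    """Return the allocated CPUs that are NOT in the GPU's preferred set."""
    return parse_cpu_list(cpuset_text) - GRES_CPUS[gpu_device]
```

An empty set from outside_preferred(contents_of_cpuset_cpus, "/dev/nvidia4") would mean the job's CPUs all fall inside the preferred set for that GPU; a non-empty set lists the CPUs that fall outside it, which is the situation we are seeing.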