Hi,

We are running Slurm 17.02.7.

For some reason, we are observing that the preferred CPUs defined in gres.conf 
for GPU devices are being ignored when running jobs.  That is, in our gres.conf 
we have gpu resource lines, such as:

Name=gpu Type=kepler File=/dev/nvidia0 CPUs=0,1,2,3,4,5,6,7,16,17,18,19,20,21,22,23

and

Name=gpu Type=kepler File=/dev/nvidia4 CPUs=8,9,10,11,12,13,14,15,24,25,26,27,28,29,30,31

but when we run a job with the second GPU allocated, 
/sys/fs/cgroup/cpuset/slurm/..../cpuset.cpus reports that the job has been 
allocated CPUs from the first GPU's set.  It appears that the CPU/GPU affinity 
in gres.conf is being ignored entirely.  The slurmd log shows nothing 
relevant, even at maximum debug verbosity.
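
The comparison described above can be sketched in Python as follows.  This is 
only an illustration, not part of Slurm: the helper names parse_cpu_list and 
outside_preferred are made up here, and the example allocation string is 
hypothetical rather than copied from a real job.  It assumes cpuset.cpus uses 
the usual kernel list format (comma-separated values and ranges, e.g. "0-7,16-23").

```python
def parse_cpu_list(text):
    """Expand a CPU list such as "0-7,16-23" or "8,9,10" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def outside_preferred(allocated, preferred):
    """Return the sorted allocated CPUs that are not in the preferred set."""
    return sorted(parse_cpu_list(allocated) - parse_cpu_list(preferred))


# CPUs= value from the gres.conf line for /dev/nvidia4 (the second GPU).
SECOND_GPU_CPUS = "8,9,10,11,12,13,14,15,24,25,26,27,28,29,30,31"

# Hypothetical cpuset.cpus contents, for illustration only.
example_allocation = "0-3"

# Any CPU printed here was allocated outside the GPU's preferred set.
print(outside_preferred(example_allocation, SECOND_GPU_CPUS))
```

In practice the allocation string would be read from the job's cpuset.cpus 
file; an empty result would mean the job stayed within the GPU's CPU set.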

We have tried the following TaskPlugin settings: "task/affinity,task/cgroup" 
and just "task/cgroup".  In both cases we have tried setting TaskPluginParam to 
"Cpuset".  All of these configurations produced the same incorrect results.

Is there some special configuration that is needed to get CPU/GPU affinity 
through gres.conf to work as described in the documentation?

Thanks
