Hi, I'm currently playing with SLURM 17.11.7, cgroups, and a node with 2 GPUs. Everything works fine if I set the GPUs to be consumable: cgroups do their job and the right device is allocated to the right job. However, it doesn't work if I set `Gres=gpu:no_consume:2`: for some reason, SLURM no longer allows access to the devices.
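In case it helps, here is a rough sketch of the relevant configuration. The `Gres=gpu:no_consume:2` line is exactly what I set and the node name comes from the logs below; the remaining lines (`GresTypes`, `TaskPlugin`, the cgroup.conf settings, and the gres.conf device paths) are the usual GPU/cgroup boilerplate and may differ slightly in detail from my actual files:

```
# slurm.conf (relevant lines only; other node parameters omitted)
GresTypes=gpu
TaskPlugin=task/cgroup
NodeName=imk-dl-01 Gres=gpu:no_consume:2

# cgroup.conf
CgroupAutomount=yes
ConstrainDevices=yes

# gres.conf on imk-dl-01 (standard NVIDIA device paths assumed)
Name=gpu File=/dev/nvidia0
Name=gpu File=/dev/nvidia1
```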
With `no_consume` set, slurmstepd logs the following for every job and step:

Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug: Not allowing access to device c 195:0 rwm(/dev/nvidia0) for job
Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug: Not allowing access to device c 195:1 rwm(/dev/nvidia1) for job
Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug: Not allowing access to device c 195:0 rwm(/dev/nvidia0) for step
Jul 10 13:54:23 imk-dl-01 slurmstepd[2232]: debug: Not allowing access to device c 195:1 rwm(/dev/nvidia1) for step

I don't understand why it doesn't work. I'm using nvidia-384, and I can launch multiple processes on a single GPU outside of SLURM without any problem. Any ideas?

Thanks,
-F