Jobs do end up on the same GPU. If I run the CUDA deviceQuery sample in the sbatch script I get:

Device PCI Domain ID / Bus ID / location ID:   0 / 97 / 0
Device PCI Domain ID / Bus ID / location ID:   0 / 97 / 0
Device PCI Domain ID / Bus ID / location ID:   0 / 97 / 0
Device PCI Domain ID / Bus ID / location ID:   0 / 97 / 0

Our cgroup.conf (/etc/slurm/cgroup.conf):

CgroupAutomount=yes
CgroupReleaseAgentDir="/etc/slurm/cgroup"
ConstrainCores=yes
ConstrainDevices=yes
ConstrainRAMSpace=yes

Daniel

On 23.05.2019 9:54, Aaron Jackson wrote:
> Do jobs actually end up on the same GPU though? cgroups will always
> refer to the first allocated GPU as 0, so it is not unexpected for each
> job to have CUDA_VISIBLE_DEVICES set to 0. Make sure you have the
> following in /etc/cgroup.conf
>
> ConstrainDevices=yes
>
> Aaron
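
For reference, the bus-ID output above can be reproduced without the full deviceQuery sample. A minimal sketch of the same check, assuming a working CUDA toolkit on the node (the file name check_gpu.cu and the compile line are only illustrative):

#include <stdio.h>
#include <cuda_runtime.h>

int main(void)
{
    int n = 0;
    cudaError_t err = cudaGetDeviceCount(&n);
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaGetDeviceCount failed: %s\n", cudaGetErrorString(err));
        return 1;
    }
    /* With ConstrainDevices=yes each job only sees its own allocation,
       so the device index is always 0..n-1 inside the job. The PCI
       domain/bus/device IDs identify the physical card. */
    for (int i = 0; i < n; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("device %d: %s  PCI domain/bus/device = %d / %d / %d\n",
               i, prop.name, prop.pciDomainID, prop.pciBusID, prop.pciDeviceID);
    }
    return 0;
}

Compile with "nvcc check_gpu.cu -o check_gpu" and run it from each sbatch job: identical bus IDs across jobs mean the jobs really are sharing one physical card, whereas differing bus IDs with CUDA_VISIBLE_DEVICES=0 in every job would just be the cgroup remapping Aaron describes.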