Hello,

I am operating a small cluster of 8 nodes, each with 20 cores (two 10-core CPUs) and 2 GPUs (NVIDIA K80). To date, I have been successfully running CUDA code, typically submitting single-CPU, single-GPU jobs to the nodes via Slurm with the cons_res and CR_CPU options.
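For reference, a typical batch script for one of these single-CPU, single-GPU jobs looks roughly like the sketch below. This is from memory rather than copied verbatim, so take the exact options as approximate:

    #!/bin/bash
    # Rough sketch of my usual single-CPU, single-GPU job script.
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=1
    # Request one of the two GPUs on the node.
    #SBATCH --gres=gpu:1

    # a.out stands in for the actual CUDA binary.
    srun ./a.out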
More recently, I have been trying to have multiple MPI processes access a single GPU. The issue I am experiencing is that CUDA appears to reserve a virtual address space roughly the size of all available memory (system + GPU) for each MPI process, as this top output shows:

    PID    USER  PR  NI  VIRT   RES   SHR   S  %CPU   %MEM  TIME+    COMMAND
    95292  jan   20   0  1115m  14m   9872  R  100.5   0.0  0:02.07  prjmh
    95295  jan   20   0  26.3g  145m  95m   R  100.5   0.5  0:01.81  prjmh
    95293  jan   20   0  26.3g  145m  95m   R   98.6   0.5  0:01.80  prjmh
    95294  jan   20   0  26.3g  145m  95m   R   98.6   0.5  0:01.81  prjmh

Note: PID 95292 is the master, which does not access the GPU. The other three processes do access the GPU.

This results in Slurm killing the job:

    slurmstepd: Exceeded job memory limit
    slurmstepd: Step 5705.0 exceeded virtual memory limit (83806300 > 29491200), being killed
    slurmstepd: Step 5705.0 exceeded virtual memory limit (83806300 > 29491200), being killed
    slurmstepd: Exceeded job memory limit
    slurmstepd: Exceeded job memory limit
    slurmstepd: Exceeded job memory limit
    srun: got SIGCONT
    slurmstepd: *** JOB 5705 CANCELLED AT 2018-02-04T13:47:00 *** on compute-0-3
    srun: forcing job termination
    srun: error: compute-0-3: task 0: Killed
    srun: error: compute-0-3: tasks 1-3: Killed

Note: When I log into the node and run the program manually with mpirun -np=20 a.out, it runs without issues.

Is there a way to change the Slurm configuration so that it does not kill these jobs? I have read through the documentation to some extent, but my limited Slurm knowledge did not allow me to find a solution.

Thanks very much,
Jan
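P.S. In case the submission details matter: the failing MPI job was submitted with a batch script roughly like the one below. This is a reconstruction from memory, so treat the exact numbers as approximate; I believe the memory request was around 28800 MB, which would match the 29491200 KB limit in the slurmstepd messages.

    #!/bin/bash
    # Approximate script for the failing MPI job (job 5705).
    #SBATCH --nodes=1
    # 4 MPI processes: 1 master plus 3 that access the GPU.
    #SBATCH --ntasks=4
    #SBATCH --cpus-per-task=1
    # All ranks share a single GPU.
    #SBATCH --gres=gpu:1
    # Memory request in MB (approximate; 28800 MB = 29491200 KB).
    #SBATCH --mem=28800

    # prjmh is the MPI+CUDA binary shown in the top output above.
    srun ./prjmh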