Hi, I am currently running into an issue with Slurm's GPU resource limits. I tried to restrict the number of GPUs a user can use by running:

    sacctmgr modify user lyz set MaxTRES=gres/gpu=2

This is intended to limit user 'lyz' to a maximum of 2 GPUs. However, when the user submits a job with srun and selects GPUs 0, 1, 2 and 3 themselves, either in the job script or by setting os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3" in Python, the job still uses all 4 GPUs at run time. So the GPU limit is not being enforced as I expected. How can I resolve this?
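For concreteness, here is a minimal sketch of the kind of script the user runs under srun. The actual workload is different, and the torch calls are only illustrative, but the CUDA_VISIBLE_DEVICES line is exactly what the user puts in the script:

    import os

    # The script selects all four GPUs itself instead of relying on
    # whatever device list Slurm would otherwise expose to the job.
    os.environ["CUDA_VISIBLE_DEVICES"] = "0,1,2,3"

    import torch  # illustrative only; the real workload differs

    # Reports 4, i.e. the job sees every GPU on the node despite the
    # MaxTRES=gres/gpu=2 limit set through sacctmgr above.
    print("visible GPUs:", torch.cuda.device_count())

I am not sure whether the way the job requests (or does not request) GPUs on the srun command line matters here, so I have left that out of the sketch.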