Hi Xaver,
what kind of SelectType are you using in your slurm.conf?
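If you are unsure, here is a minimal sketch for looking it up. The helper name select_type and the default path /etc/slurm/slurm.conf are my assumptions, so pass whatever path your installation uses:

```shell
# Hypothetical helper: print the SelectType value set in a slurm.conf file.
# The default path below is an assumption; give your own path as argument.
select_type() {
    conf="${1:-/etc/slurm/slurm.conf}"
    # Keep only uncommented "SelectType=..." lines (not SelectTypeParameters),
    # take the last one, then strip the key and any trailing comment.
    grep -i '^[[:space:]]*SelectType[[:space:]]*=' "$conf" \
        | tail -n 1 \
        | sed 's/^[^=]*=[[:space:]]*//; s/[[:space:]#].*$//'
}
```

On a running cluster, "scontrol show config | grep -i SelectType" shows the value the controller is actually using, which may differ from the file if slurmctld has not been restarted since the last change.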
Per https://slurm.schedmd.com/gres.html you have to consider:
"As for the --gpu* option, these options are only supported by Slurm's
select/cons_tres plugin."
So you can use "--gpus ..." only if you set
SelectType=select/cons_tres
in your slurm.conf.
But "--gres=gpu:1" should always work, regardless of the select plugin.
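As a rough sketch of how the pieces could fit together: NodeName=test and Gres=gpu:1 are taken from your mail, while the SelectTypeParameters value and the device file /dev/nvidia0 are assumptions on my side that you would have to adapt to your node:

```
# slurm.conf (excerpt)
SelectType=select/cons_tres
# Assumption: choose the CR_* value that fits your site.
SelectTypeParameters=CR_Core
GresTypes=gpu
NodeName=test Gres=gpu:1

# gres.conf on node "test"
# Assumption: /dev/nvidia0 is the device file of the GPU.
NodeName=test Name=gpu File=/dev/nvidia0
```

As far as I know, a changed SelectType only takes effect after slurmctld has been restarted, so 'scontrol reconfigure' alone may not be enough.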
Regards
Hermann
On 7/17/23 13:43, Xaver Stiensmeier wrote:
Hey,
I am currently trying to understand how I can schedule a job that needs
a GPU.
I read about GRES https://slurm.schedmd.com/gres.html and tried to use:
GresTypes=gpu
NodeName=test Gres=gpu:1
But after a 'sudo scontrol reconfigure', calling:
srun --gpus 1 hostname
didn't work:
srun: error: Unable to allocate resources: Invalid generic resource (gres) specification
So I read more at https://slurm.schedmd.com/gres.conf.html, but that didn't
really help me.
I am rather confused. GRES claims to be generic resources but then it
comes with three defined resources (GPU, MPS, MIG) and using one of
those didn't work in my case.
Obviously, I am misunderstanding something, but I am unsure where to look.
Best regards,
Xaver Stiensmeier