Hi Sushil,
Try changing the NodeName specification to:
NodeName=localhost CPUs=96 State=UNKNOWN Gres=gpu:8
Also:
TaskPlugin=task/cgroup
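
For these settings to confine each job to its own GPU, the GRES type usually also has to be declared and device access constrained through cgroups. A minimal sketch of the related fragments, assuming the GPUs appear as /dev/nvidia0 through /dev/nvidia7 (the device paths are an assumption, not something stated in this thread):

```
# slurm.conf (fragment)
GresTypes=gpu
TaskPlugin=task/cgroup
NodeName=localhost CPUs=96 State=UNKNOWN Gres=gpu:8

# gres.conf (fragment); the device file names are assumed
NodeName=localhost Name=gpu File=/dev/nvidia[0-7]

# cgroup.conf (fragment); restricts each job to its allocated devices
ConstrainDevices=yes
```

Each job then needs to request a GPU explicitly, for example with --gres=gpu:1, and slurmctld and slurmd need a restart after these files change.
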
Best,
Steve
On Wed, Apr 6, 2022 at 9:56 AM Sushil Mishra wrote:
> Dear SLURM users,
>
> I am very new to Slurm and need some help in configuring Slurm in
Hello,
Try commenting out the line:
AutoDetect=nvml
And then restart "slurmd" and "slurmctld".
Job allocations to the same GPU might be an effect of automatic MPS
configuration, though I'm not 100% sure:
https://slurm.schedmd.com/gres.html#MPS_Management
Kind Regards
--
Kamil Wilczek
Dear SLURM users,
I am very new to Slurm and need some help configuring it on a single-node
machine. This machine has 8 Nvidia GPUs and 96 CPU cores. The vendor has
set up a "LocalQ", but this is somehow running all the calculations on GPU
0. If I submit 4 independent jobs at a time, it starts ru