Hafedh (Professional Services, TC)
Sent: Thursday, January 18, 2024 9:38 AM
To: Slurm User Community List <slurm-users@lists.schedmd.com>
Subject: Re: [slurm-users] Need help with running multiple instances/executions
of a batch script in parallel (with NVIDIA HGX A100 GPU as a Gres)
Hi Noam and Matthias,
Thanks both for your answers.
I replaced the "#SBATCH --gres=gpu:4" directive (in the batch script) with
"#SBATCH --gres=gpu:1" as you suggested, but it didn't make a difference:
running this batch script 3 times results in the first job being in a
running state, wh
Hello Experts,
I'm a new Slurm user (so please bear with me :) ...).
Recently we've deployed Slurm version 23.11 on a very simple cluster, which
consists of a Master node (acting as a Login & Slurmdbd node as well), a
Compute Node which has an NVIDIA HGX A100-SXM4-40GB GPU, detected as 4 x GPUs