Hi,
We are testing the MIG deployment on our new Slurm compute node with 4 x
H100 GPUs. Everything appears to be configured correctly, but we have a
problem accessing the MIG devices. When I submit jobs requesting a MIG GPU
device (#SBATCH --gres=gpu:H100_1g.10gb:1), the jobs get submitted to the
node,
-
*From:* slurm-users on behalf of Vogt, Timon
*Sent:* Wednesday, July 19, 2023 3:08 PM
*To:* slurm-us...@schedmd.com
*Subject:* [slurm-users] MIG-Slice: Unavailable GRES
Dear Slurm Mailing List,
I am experiencing a problem which affects our cluster and for which I am
completely out of ide
instead and see if it works then. We've used 3g.20gb and
1g.5gb on our nodes and they work fine; we have never tried 2g.10gb.
Rob
From: slurm-users on behalf of Vogt, Timon
Sent: Wednesday, July 19, 2023 3:08 PM
To: slurm-us...@schedmd.com
Subject: [slurm-users] MIG-Slice: Unavailable GRES
Dear Slurm Mailing List,
I am experiencing a problem which affects our cluster and for which I am
completely out of ideas by now, so I would like to ask the community for
hints or ideas.
We run a partition on our cluster containing multiple nodes with Nvidia
A100 GPUs (40GB), which we have s