No, I can't submit more than 7 individual jobs and have them all run; the jobs
after the first 7 go to pending until the first 7 finish.
And it's not a limit (at least, not one of "7"), because here's the same problem
on a node configured with 2x 3g.20gb per card (2 cards, so 4 MIG GPUs total in
the node):
[rug262@testsch (RC) slurm] sinfo -o "%20N %10c %10m %25f %40G "
NODELIST    CPUS  MEMORY  AVAIL_FEATURES  GRES
t-gc-1201   48    358400  3gc20gb         gpu:nvidia_a100_3g.20gb:4(S:0)
So there are 4 of them on that node.
[rug262@testsch (RC) slurm] sbatch --gpus=1 --cpus-per-task=2 --partition=debug --nodelist=t-gc-1201 --wrap="sleep 100"
I submit 3 of these jobs, each asking for 1 GPU from that node:
[rug262@testsch (RC) slurm] squeue
JOBID  PARTITION  NAME  USER    ST  TIME  NODES  NODELIST(REASON)
 5049  debug      wrap  rug262  PD  0:00  1      (Resources)
 5048  debug      wrap  rug262  R   0:09  1      t-gc-1201
 5047  debug      wrap  rug262  R   0:31  1      t-gc-1201
The first 2 run fine, but any after that go to pending, even though there should
be 4 available (according to the sinfo output).
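A couple of checks that should show what Slurm thinks is already allocated on
that node, and why the third job is held back (just a diagnostic sketch,
re-using the node and job IDs from above):
scontrol show node t-gc-1201 | grep -Ei 'gres|alloctres|cpualloc'
scontrol show job 5049 | grep -i reason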
Rob
________________________________
From: slurm-users <[email protected]> on behalf of Yair
Yarom <[email protected]>
Sent: Thursday, November 17, 2022 8:19 AM
To: Slurm User Community List <[email protected]>
Subject: Re: [slurm-users] NVIDIA MIG question
Can you request more than 7 single gpu jobs on the same node?
It could be that there's another limit you've encountered (e.g. memory or cpu),
or some other limit (in the account, partition, or qos)
On our setup we're limiting jobs to 1 gpu per job (via partition qos), however
we can use up all the MIGs with single gpu jobs.
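A couple of places such a limit could show up, and how to look (a rough sketch;
replace the partition name with the one you actually submit to):
scontrol show partition <your_partition>
sacctmgr show qos format=Name,MaxTRESPU,MaxJobsPU
sacctmgr show assoc where user=$USER format=Account,Partition,QOS,GrpTRES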
On Wed, 16 Nov 2022 at 23:48, Groner, Rob <[email protected]> wrote:
That does help, thanks for the extra info.
If I have two separate GPU cards in the node, and I set up 7 MIGs on each card,
for a total of 14 MIG "gpus" in the node... then SHOULD I be able to salloc
requesting, say, 10 GPUs (7 from one card, 3 from the other)? Because I can't.
I can request up to 7 just fine. When I request more than that, it pulls in
other nodes to satisfy the request, even though there are theoretically 14 on
the one node. When I ask for 8, it gives me 7 from t-gc-1202 and then 1 from
t-gc-1201. When I ask for 10, it fails because it can't give me 10 without
using 2 cards in one node.
[rug262@testsch ~ ]# sinfo -o "%20N %10c %10m %25f %50G "
NODELIST    CPUS  MEMORY  AVAIL_FEATURES  GRES
t-gc-1201   48    358400  3gc20gb         gpu:nvidia_a100_3g.20gb:4(S:0)
t-gc-1202   48    358400  1gc5gb          gpu:nvidia_a100_1g.5gb:14(S:0)
[rug262@testsch (RC) ~] salloc --gpus=10 --account=1gc5gb --partition=sla-prio
salloc: Job allocation 5015 has been revoked.
salloc: error: Job submit/allocate failed: Requested node configuration is not available
Rob
________________________________
From: slurm-users <[email protected]> on behalf of Yair Yarom <[email protected]>
Sent: Wednesday, November 16, 2022 3:48 AM
To: Slurm User Community List <[email protected]>
Subject: Re: [slurm-users] NVIDIA MIG question
Hi,
From what we observed, Slurm sees each MIG as a distinct gres/gpu, so you
can have 14 jobs, each using a different MIG.
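(For Slurm to see them at all, the MIG instances of course have to be declared
to Slurm; a minimal gres.conf sketch, assuming a recent enough Slurm (21.08+)
built with NVML support, is just to let slurmd enumerate them itself:
AutoDetect=nvml
plus a matching Gres= count on the node's NodeName line in slurm.conf.)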
However (unless something has changed in the past year), due to NVIDIA
limitations, a single process can't access more than one MIG simultaneously
(this is unrelated to Slurm). So while you can have a user request a Slurm job
with 2 GPUs (MIGs), they'll have to run two distinct processes within that job
in order to utilize those two MIGs.
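Something along these lines should work (just a sketch; the program name and
inputs are placeholders, and srun --exact needs a reasonably recent Slurm):
#!/bin/bash
#SBATCH --gpus=2
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=2
# two independent steps, one MIG each, running in parallel
srun --ntasks=1 --gpus=1 --exact ./my_gpu_program input_a &
srun --ntasks=1 --gpus=1 --exact ./my_gpu_program input_b &
wait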
HTH,
On Tue, 15 Nov 2022 at 23:42, Laurence <[email protected]> wrote:
Hi Rob,
Yes, those questions make sense. From what I understand, MIG should essentially
split the GPU so that the instances behave as separate cards. Hence two
different users should be able to use two different MIG instances at the same
time, and a single job could also use all 14 instances. The result you observed
suggests that MIG is a feature of the driver, i.e. lspci shows one device but
nvidia-smi shows 7 devices.
I haven't played around with this myself in Slurm but would be interested to
know the answers.
Laurence
On 15/11/2022 17:46, Groner, Rob wrote:
We have successfully used the nvidia-smi tool to take the 2 A100s in a node
and split them into multiple GPU devices. In one case, we split the 2 GPUs
into 7 MIG devices each, so 14 in that node total, and in the other case, we
split the 2 GPUs into 2 MIG devices each, so 4 total in the node.
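(Roughly the kind of sequence involved, for reference; a sketch rather than our
exact commands, and the profile IDs are the A100 ones from nvidia-smi mig -lgip,
which can differ by driver version:)
nvidia-smi -i 0 -mig 1                              # enable MIG mode on GPU 0 (same for GPU 1)
nvidia-smi mig -i 0 -cgi 19,19,19,19,19,19,19 -C    # 7x 1g.5gb instances per card on one node
nvidia-smi mig -i 0 -cgi 9,9 -C                     # ...or 2x 3g.20gb per card on the other node
nvidia-smi -L                                       # list the resulting MIG devices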
From our limited testing so far, and from the sinfo output, it appears that
Slurm might be considering all of the MIG devices on the node to be on the same
socket (even though the MIG devices come from two separate graphics cards in
the node). The sinfo output says (S:0) after the 14 devices are shown,
indicating they're on socket 0. That seems to be preventing 2 different users
from using MIG devices at the same time. Am I wrong that having 14 MIG gres
devices show up in Slurm should mean that, in theory, 14 different users could
each use one at the same time?
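(To cross-check where the devices really sit, something like this on the node
itself should show it; just a diagnostic sketch: slurmd -G prints the GRES
devices slurmd detected, and nvidia-smi topo -m shows each card's CPU/NUMA
affinity.)
slurmd -G
nvidia-smi topo -m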
Even IF that doesn't work... if I have 14 devices spread across 2 physical GPU
cards, can one user utilize all 14 for a single job? I would hope that Slurm
would treat each of the MIG devices as its own separate card, which would mean
14 different jobs could run at the same time, each using its own particular
MIG, right?
Do those questions make sense to anyone? 🙂
Rob
--
/| |
\/ | Yair Yarom | System Group (DevOps)
[] | The Rachel and Selim Benin School
[] /\ | of Computer Science and Engineering
[]//\\/ | The Hebrew University of Jerusalem
[// \\ | T +972-2-5494522 | F +972-2-5494522
// \ | [email protected]
// |