subject:"[slurm-users] NVIDIA MIG question"

Re: [slurm-users] NVIDIA MIG question

2022-11-17 Thread Groner, Rob

just 1 gpu, without them going to pending (until all gpus are used up). Rob From: slurm-users on behalf of Groner, Rob Sent: Thursday, November 17, 2022 10:08 AM To: Slurm User Community List Subject: Re: [slurm-users] NVIDIA MIG question No, I can't s

Re: [slurm-users] NVIDIA MIG question

2022-11-17 Thread Groner, Rob

The first 2 go fine, but any after that go to pending, even though there should be 4 available (according to sinfo output) Rob From: slurm-users on behalf of Yair Yarom Sent: Thursday, November 17, 2022 8:19 AM To: Slurm User Community List Subject: Re: [slurm-us

Re: [slurm-users] NVIDIA MIG question

2022-11-17 Thread Yair Yarom

0 --account=1gc5gb
> --partition=sla-prio
> salloc: Job allocation 5015 has been revoked.
> salloc: error: Job submit/allocate failed: Requested node configuration is not available
>
> Rob
>
> --
> *From:* slurm-users on behalf of Ya

Re: [slurm-users] NVIDIA MIG question

2022-11-16 Thread Groner, Rob

ration is not available Rob From: slurm-users on behalf of Yair Yarom Sent: Wednesday, November 16, 2022 3:48 AM To: Slurm User Community List Subject: Re: [slurm-users] NVIDIA MIG question

Re: [slurm-users] NVIDIA MIG question

2022-11-16 Thread Yair Yarom

Hi, From what we observed, Slurm sees the MIGs each as a distinct gres/gpu. So you can have 14 jobs each using a different MIG. However (unless something has changed in the past year), due to NVIDIA limitations, a single process can't access more than one MIG simultaneously (this is unrelated to

Re: [slurm-users] NVIDIA MIG question

2022-11-15 Thread Laurence

Hi Rob, Yes, those questions make sense. From what I understand, MIG should essentially split the GPU so that they behave as separate cards. Hence two different users should be able to use two different MIG instances at the same time and also a single job could use all 14 instances. The resu

[slurm-users] NVIDIA MIG question

2022-11-15 Thread Groner, Rob

We have successfully used the nvidia-smi tool to take the 2 A100s in a node and split them into multiple GPU devices. In one case, we split the 2 GPUs into 7 MIG devices each, so 14 in that node total, and in the other case, we split the 2 GPUs into 2 MIG devices each, so 4 total in the node.

7 matches
