On 7/14/23 1:10 pm, Wilson, Steven M wrote:
It's not so much whether a job can access the GPU but rather
which GPUs are included in $CUDA_VISIBLE_DEVICES. That is what
controls what our CUDA jobs can see and therefore use (within any
cgroups constraints, of course). In my case
ly 18, 2023 5:32 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Unconfigured GPUs being allocated
Further testing and looking at the source code confirms what looks to me like a
bug in Slurm. GPUs that are not configured in gres.conf are detected by slurmd
in the system and disc
eem to have an effect upon the actual
environment of the job.
Steve
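
For anyone wanting to check this from inside a job, here is a minimal Python sketch that reports what $CUDA_VISIBLE_DEVICES actually contains. The helper names (visible_gpu_ids, unexpected_gpus) and the idea of comparing against an "expected" set are assumptions for illustration and are not part of Slurm or CUDA. It only reads the environment and does not query the driver.

```python
import os


def visible_gpu_ids(env=None):
    """Return the device identifiers listed in CUDA_VISIBLE_DEVICES.

    Returns None when the variable is unset (CUDA then applies no
    filtering of its own) and an empty list when it is set but empty
    (CUDA exposes no GPU). Identifiers are kept as strings because
    they may be indices or UUIDs.
    """
    env = os.environ if env is None else env
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None
    return [tok.strip() for tok in raw.split(",") if tok.strip()]


def unexpected_gpus(expected, env=None):
    """Return identifiers that are visible but not in `expected`.

    `expected` is a hypothetical set of identifiers the site intends
    jobs to see. Returns None when the variable is unset.
    """
    ids = visible_gpu_ids(env)
    if ids is None:
        return None
    return [gpu for gpu in ids if gpu not in expected]


if __name__ == "__main__":
    print("CUDA_VISIBLE_DEVICES ->", visible_gpu_ids())
```

Running this as the first step of a batch job on an affected node shows the device list the job was handed, which can then be compared with what gres.conf defines for that node.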
From: Wilson, Steven M
Sent: Friday, July 14, 2023 4:10 PM
To: slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Unconfigured GPUs being allocated
It's not so much whether a job may o
___
From: slurm-users on behalf of Feng Zhang
Sent: Friday, July 14, 2023 3:09 PM
To: Slurm User Community List
Subject: Re: [slurm-users] Unconfigured GPUs being allocated
slurm-users@lists.schedmd.com
Subject: Re: [slurm-users] Unconfigured GPUs being allocated
Very interesting issue.
I am guessing there might be a workaround: since oryx has 2 GPUs,
could you define both of them but disable the GT 710? Does
Slurm support this?
Best,
Feng
On Tue, Jun 27, 2023 at 9:54 AM Wilson, Steven M wrote:
>
> Hi,
>
> I manually configure the
On 7/14/23 10:20 am, Wilson, Steven M wrote:
I upgraded Slurm to 23.02.3 but I'm still running into the same problem.
Unconfigured GPUs (those absent from gres.conf and slurm.conf) are still
being made available to jobs so we end up with compute jobs being run on
GPUs which should only be used
Any ideas?
Thanks,
Steve
___
Hi,
I manually configure the GPUs in our Slurm configuration (AutoDetect=off in
gres.conf) and everything works fine when all the GPUs in a node are configured
in gres.conf and available to Slurm. But we have some nodes where a GPU is
reserved for running the display and is specifically not co