Hi Hafedh,
Your job script has the sbatch directive "--gpus-per-node=4" set. I suspect
that if you look at what's allocated to the running job by doing "scontrol show
job <jobid>" and checking the TRES field, you'll see it's been allocated 4 GPUs
instead of one.
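For illustration, here is one way to pull the GPU count out of the TRES field. This is a sketch: the sample TRES string below is illustrative, not taken from the original job.

```shell
# Sample TRES string as it might appear in "scontrol show job <jobid>"
# output (values here are illustrative placeholders)
tres="TRES=cpu=16,mem=64G,node=1,billing=16,gres/gpu=4"

# Strip everything up to and including "gres/gpu=" to get the GPU count
gpus="${tres##*gres/gpu=}"
echo "GPUs allocated: $gpus"
```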
Regards,
--Troy
From: slurm-us
Requesting --exclusive and then using $SLURM_CPUS_ON_NODE to determine the
number of the tasks or threads to use inside the job script would be my
recommendation.
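A minimal sketch of such a job script follows. The time limit and the use of OMP_NUM_THREADS are assumptions for illustration, and the script falls back to 1 thread when run outside a Slurm allocation.

```shell
#!/bin/bash
#SBATCH --exclusive
#SBATCH --time=01:00:00

# With --exclusive, SLURM_CPUS_ON_NODE reports all CPUs on the allocated
# node; default to 1 so the script also works outside of Slurm.
NTHREADS="${SLURM_CPUS_ON_NODE:-1}"

# OMP_NUM_THREADS is one common way to pass the thread count to a
# threaded application (the application itself is assumed here).
export OMP_NUM_THREADS="$NTHREADS"
echo "running with $NTHREADS threads"
```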
--Troy
-----Original Message-----
From: slurm-users On Behalf Of Tina
Friedrich
Sent: Tuesday, March 22, 2022 10:43 AM
To:
My site has just updated to Slurm 21.08 and we are looking at moving to the
built-in job script capture capability, so I'm curious about this as well.
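If I recall correctly, enabling the built-in capture in 21.08 is a one-line slurm.conf change (a sketch; check the release notes for your exact version):

```
# slurm.conf: store each job's batch script in the accounting database
AccountingStoreFlags=job_script
```

The stored script should then be retrievable with "sacct -j <jobid> --batch-script".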
--Troy
-----Original Message-----
From: slurm-users On Behalf Of Paul
Edmon
Sent: Thursday, December 2, 2021 10:30 AM
To: slurm-users@l
We have developed a set of unit tests based on LuaUnit for our clusters' submit
filters.
--Troy
From: slurm-users On Behalf Of Michael
Robbert
Sent: Thursday, May 6, 2021 1:11 PM
To: Slurm User Community List
Subject: [slurm-users] Testing Lua job submit plugins
I'm wondering
What version of Slurm are you running? I had a problem like this in the
initial 20.02 release that was fixed in 20.02.1.
--Troy
From: slurm-users on behalf of Erik
Bryer
Reply-To: Slurm User Community List
Date: Tuesday, October 27, 2020 at 8:30 PM
To: "slurm-users@lists.sch
I've been looking at it for classroom type reservations, but I ran into a bug
where jobs that weren't eligible to access the reservation were being attracted
to it anyway. That's supposed to be fixed in 20.02.6. See
https://bugs.schedmd.com/show_bug.cgi?id=9593 for details.
--Troy
There's an outstanding feature request for that:
https://bugs.schedmd.com/show_bug.cgi?id=8383
While waiting on that, we've taken to injecting it into the job's environment
ourselves in the Lua submit filter.
--Troy
On 7/27/20, 12:45 PM, "slurm-users on behalf of Brian Andrus"
wrote:
I don’t think there’s a way to do that in Slurm using just the node
declaration, other than the previously mentioned way of configuring it to show
up as having only 1 core. However, you could put the node in a partition that
has OverSubscribe=EXCLUSIVE set, and have that partition be the only w
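The partition approach described above might be sketched in slurm.conf like this (node and partition names are placeholders):

```
# slurm.conf sketch: jobs in this partition always get whole nodes
PartitionName=wholenode Nodes=node[01-04] OverSubscribe=EXCLUSIVE
```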