I am facing the same problem that was reported long ago (2019) in this
mailing list thread:
https://lists.schedmd.com/pipermail/slurm-users/2019-July/003785.html
but with more recent software versions, i.e.:
Slurm 21.08.8-2
PMIx 2.2.5 (pmix-2.2.5-1.el8.src.rpm)
Open MPI 4.1.5
In a similar
Hi,
We have the problem that increasing numbers of new users have little to
no idea about the amount of resources their programs can use
efficiently. Thus, they will often just request 32 cores, because
that's what most of our nodes have, and 128 or 256 GB, for reasons which
are unclear to me, ev
Hi all,
I'm trying to set up two overlapping partitions, say normal and hi-pri,
so that when jobs are launched in the second one they can preempt the jobs
already running in the first one, automatically putting them in the suspended
state. After completion, the jobs in the normal partition must b
Hi Fabrizio,
Fabrizio Roccato writes:
> Hi all,
> I'm trying to set up two overlapping partitions, say normal and hi-pri,
> so that when jobs are launched in the second one they can preempt the jobs
> already running in the first one, automatically putting them in the suspended
> state. After comp
Hi all,
We recently upgraded Slurm to 23.02. Since then we have been getting the
following error in our logs:
May 21 03:23:27 s-sc-gpu001 slurmstepd[2723991]: error:
slurm_send_node_msg: hash_g_compute: REQUEST_STEP_COMPLETE has error
May 21 03:24:27 s-sc-gpu001 slurmstepd[2723991]: error: hash_g_co
What you are describing is definitely doable. We have our system set up
similarly. All nodes are in both the "open" partition and the "prio" partition,
but a job submitted to the "prio" partition will preempt the open jobs.
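For illustration only, here is a minimal sketch of what a partition-priority preemption setup of this kind can look like in slurm.conf. It is not this site's actual configuration: the node names are hypothetical placeholders, and the choice of suspend (rather than requeue or cancel) is an assumption that follows the behaviour asked for in the original question.

```
# Hypothetical slurm.conf fragment; node names are placeholders.
# Preempt based on the priority tier of the partition a job runs in.
PreemptType=preempt/partition_prio
# Suspend preempted jobs; GANG is needed so suspended jobs get resumed.
PreemptMode=SUSPEND,GANG

# Both partitions cover the same nodes; the higher PriorityTier wins.
PartitionName=open Nodes=node[01-10] Default=YES PriorityTier=1 PreemptMode=SUSPEND
PartitionName=prio Nodes=node[01-10] PriorityTier=2 PreemptMode=OFF
```

With settings along these lines, a job in "prio" that needs resources held by "open" jobs should cause those jobs to be suspended, and they should resume once the resources are free again. Suspended jobs stay in memory, so the nodes need enough memory for both.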
I don't see anything clearly wrong with your slurm.conf settings. Ours are
ver
Hi,
The release notes for 23.02 say "Added usage gathering for gpu/nvml (Nvidia)
and gpu/rsmi (AMD) plugins".
How would I go about enabling this?
Thanks!
--
Ben Fulton
Research Applications and Deep Learning
Research Technologies
Indiana University
On 5/24/23 11:39 am, Fulton, Ben wrote:
> Hi,
Hi Ben,
> The release notes for 23.02 say “Added usage gathering for gpu/nvml
> (Nvidia) and gpu/rsmi (AMD) plugins”.
> How would I go about enabling this?
I can only comment on the nvidia side (as those are the GPUs we have)
but for that you need S