Hi Craig,
It's not essential to use the PMIx library that was used to build the Slurm PMIx plugin, but
doing so does reduce the likelihood of problems.
I don’t know how, but there is some way for the admin installing Slurm to
“name” the available pmix --mpi options.
For instance on one of our systems, the admin has bui
conf.log...
checking if user requested PMI support
result: no
checking if user requested internal PMIx support(yes)
result: no
checking for pmix.h in /usr
result: not found
checking for pmix.h in /usr/include
result: not found
WARNING: discovered external PMIx version
Hi Craig,
Your use of --with-pmix on the Open MPI configure line is important.
Without any arguments to this option, the Open MPI configure script first checks
whether there’s an external PMIx that is newer than the one included in the
Open MPI release tarball. If it is not, the internal
srun: MPI types are...
srun: none
srun: openmpi
srun: pmix_v3
srun: pmi2
srun: pmix
but I'm not sure that tells me much about how I am supposed to be
building OpenMPI?
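
For anyone who wants to check the plugin list from a script, here is a minimal Python sketch that parses a listing like the one quoted above and picks out the PMIx entries. It assumes the output has exactly the "srun: <name>" layout shown in this thread; other Slurm versions may format it differently. The names parse_mpi_types and pmix_candidates are made up for illustration and are not part of Slurm.

```python
# Sketch only: assumes the "srun: <name>" layout quoted in this thread.
SAMPLE = """\
srun: MPI types are...
srun: none
srun: openmpi
srun: pmix_v3
srun: pmi2
srun: pmix
"""


def parse_mpi_types(srun_output):
    """Return the plugin names found in `srun --mpi=list` style output."""
    types = []
    for line in srun_output.splitlines():
        line = line.strip()
        # Only lines carrying the "srun:" prefix are part of the listing.
        if not line.startswith("srun:"):
            continue
        entry = line[len("srun:"):].strip()
        # Skip the header line and any empty entries.
        if not entry or entry.lower().startswith("mpi types are"):
            continue
        types.append(entry)
    return types


def pmix_candidates(types):
    """Return the entries that look like PMIx plugins (hypothetical helper)."""
    return [name for name in types if name.startswith("pmix")]


if __name__ == "__main__":
    found = parse_mpi_types(SAMPLE)
    print("all types:", found)
    print("pmix types:", pmix_candidates(found))
```

On a real system the text would come from running srun --mpi=list and capturing its output (possibly from stderr as well as stdout) instead of using the SAMPLE string. This only shows which values srun would accept for --mpi; it does not say which PMIx library Open MPI should be built against.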
On 3/27/23 14:41, Pritchard Jr., Howard wrote:
Hi Craig,
If you run
srun --mpi=list
what does Slurm report?
That will he
Hi Craig,
If you run
srun --mpi=list
what does Slurm report?
That will help determine which argument to supply for the --mpi
srun option.
Howard
From: slurm-users on behalf of Craig
Reply-To: Slurm User Community List
Date: Monday, March 27, 2023 at 12:38 PM
To: "slurm-users@
Hi Prentice,
Since the last message I figured out a way to implement power_save:
I've documented our setup in this Wiki page:
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_cloud_bursting/#configuring-slurm-conf-for-power-saving
This page contains a link to power_save scripts on GitHub.
Best r
Can someone please clarify the "best practices" for building OpenMPI
compatible with Slurm?
https://slurm.schedmd.com/mpi_guide.html#open_mpi tells me what I _can_
do but I'm unclear as to what I _should_ do.
I've built OpenMPI 4.1.5 with: --with-pmix --with-libevent=internal
--with-hwl
I'm just catching up on old mailing list messages now. Why not make your
SuspendProgram and ResumeProgram shell scripts that look at some
node information in Slurm (such as the features, as in your example) or
some other source (for example, a case statement based on node names) and call
the correct
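
The dispatch idea in this message can be sketched roughly as follows. It is written in Python rather than as the shell script with a case statement that the message suggests, and it is only an illustration: the node-name patterns, the handler paths, and the function names are all invented. It also assumes the node names have already been expanded into a plain list (Slurm passes a hostlist expression to these programs, which scontrol show hostnames can expand).

```python
import fnmatch

# Hypothetical mapping from node-name patterns to hypothetical handler
# commands; neither the patterns nor the paths come from the thread.
HANDLERS = [
    ("cloud*", "/usr/local/sbin/resume_cloud"),
    ("node*", "/usr/local/sbin/resume_ipmi"),
]


def handler_for(node, handlers=HANDLERS):
    """Return the command for the first pattern matching the node name."""
    for pattern, command in handlers:
        if fnmatch.fnmatchcase(node, pattern):
            return command
    return None  # no handler known for this node


def plan(nodes, handlers=HANDLERS):
    """Group node names by the handler command that should process them."""
    grouped = {}
    for node in nodes:
        grouped.setdefault(handler_for(node, handlers), []).append(node)
    return grouped


if __name__ == "__main__":
    # Print the plan instead of running anything, so the sketch is safe to try.
    for command, names in plan(["cloud01", "node17", "node18"]).items():
        print(command, names)
```

A real wrapper would then invoke each handler with its group of nodes; that step is left out here because it depends entirely on the site's own scripts.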
Sorry, William, for taking so long to reply (almost exactly a year!). Your
note was sent to my spam folder, and I lost access to that cluster, so it became
less of a concern.
I recently got access to another system and had the same issue even with a
local epilog with just /bin/true in it. Thi
Hi Thomas,
FYI: Slurm power_save works very well for us without the issues that you
describe below. We run Slurm 22.05.8, what's your version?
I've documented our setup in this Wiki page:
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_cloud_bursting/#configuring-slurm-conf-for-power-saving
T
On Mon, 06 Mar 2023 13:35:38 +0100,
Stefan Staeglich wrote:
> This did not fix the main error, but it might have reduced how often it
> occurs. Has anyone observed similar issues? We will try a higher
> SuspendTimeout.
We had issues with power saving. We powered the idle nodes off, caus
Hi Ümit,
Thanks for the reply. Yes, it looks like this is the issue. The master
branch suggests that claim_field can also be used,
but that is not in the version we have deployed.
Cheers,
Laurence
On 24.03.23 16:51, Ümit Seren wrote:
Looks like you are missing the userna
12 matches