Hi,
I'm refreshing an old cluster with AlmaLinux 9 (instead of
CentOS 6 😕) and building a fresh Open MPI 5.0.5 environment. I've reached
the step where Open MPI begins to work with UCX 1.17 and PMIx 5.0.3, but
not completely. The nodes use a QLogic QDR HBA with a managed QLogic
switch (
If this is a QLogic system, why not try psm2 (--mca pml cm --mca mtl psm2)? I'm not sure how
good UCX support is on these systems, and psm2 is the vendor's library. I'm not sure what
the right link to the current version is, but I found this one:
cornelisnetworks/opa-psm2 on github.com
-Nathan O
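Before trying the flags suggested above, it may help to check whether the Open MPI build contains those components at all. The following is only a minimal sketch: it assumes the ompi_info from the build under test is first in PATH, check_component is a hypothetical helper name, and ./a.out is a placeholder binary.

```shell
#!/bin/sh
# Sketch: report whether this Open MPI build has the cm PML and the psm2 MTL.
# Assumption: ompi_info belongs to the Open MPI build you intend to run with.

check_component() {
    # $1 = framework (e.g. mtl), $2 = component (e.g. psm2)
    # ompi_info lists components on lines of the form "MCA <framework>: <component> (...)"
    if ompi_info 2>/dev/null | grep -q "MCA $1: $2 "; then
        echo "$1/$2: present"
    else
        echo "$1/$2: missing"
    fi
}

if command -v ompi_info >/dev/null 2>&1; then
    check_component pml cm
    check_component mtl psm2
else
    echo "ompi_info not found in PATH"
fi

# If both are reported as present, the suggested run would look like this
# (./a.out is a placeholder):
#   mpirun --mca pml cm --mca mtl psm2 -n 2 ./a.out
```

If a component is reported as missing, the build was most likely configured on a machine without the corresponding library installed, and the --mca flags cannot select it.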
Hi Nathan,
Thanks for the suggestion. My understanding was that everything is now
handled by the UCX layer. Am I wrong?
These options do not seem to work with my Open MPI 5.0.5 build. However, I
built Open MPI on the cluster front-end, which had no HBA at the time.
I added one this evening (an old sp
Jeff,
There are several options...
First, if you want to use containers and you are not tied to Docker,
Singularity is a better fit.
If you have a resource manager that features a PMIx server, you can
simply launch the job directly (direct run).
For example, with SLURM:
srun singularity exec container.sif a.out
I do not
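As a rough illustration of the direct-run approach described above, the sketch below checks whether srun reports PMIx support and prints the command it would use. It assumes a Slurm installation built with PMIx; build_srun_cmd is a hypothetical helper, the task count of 4 is arbitrary, and container.sif and a.out are the placeholder names from the example.

```shell
#!/bin/sh
# Sketch: build a direct-launch command line for an MPI binary inside a container.
# Assumptions: Slurm with PMIx support; container.sif and a.out are placeholders.

build_srun_cmd() {
    # $1 = number of tasks, $2 = container image, $3 = program inside the image
    echo "srun --mpi=pmix -n $1 singularity exec $2 $3"
}

# "srun --mpi=list" prints the MPI plugin types this Slurm installation supports.
if command -v srun >/dev/null 2>&1 && srun --mpi=list 2>&1 | grep -q pmix; then
    cmd=$(build_srun_cmd 4 container.sif a.out)
    echo "would run: $cmd"
else
    echo "srun with PMIx support not available here"
fi
```

The script only prints the command rather than running it, so it is safe to try on a login node first.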
Gilles,
This was exactly it - thank you.
If I wanted to run the code in the container across multiple nodes, I would
need to do something like "mpirun ... 'docker run ...' "?
Thanks!
Jeff
On Mon, Sep 30, 2024 at 2:38 AM Gilles Gouaillardet via users <
users@lists.open-mpi.org> wrote:
> Jeffr