Hi,
Angel de Vicente via users writes:
> I have tried:
> + /etc/pmix-mca-params.conf
> + /usr/lib/x86_64-linux-gnu/pmix2/etc/pmix-mca.params.conf
> but no luck.
Never mind, /etc/openmpi/pmix-mca-params.conf was the right one.
Cheers,
--
Ángel de Vicente

Hello,
with our current setup of Open MPI and Slurm on an Ubuntu 22.04 server,
when we submit MPI jobs I get the message:
PMIX ERROR: ERROR in file
../../../../../../src/mca/gds/ds12/gds_ds12_lock_pthread.c at line 169
Following https://github.com/open-mpi/ompi/issues/7516, I tried setting
PMIX_
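(For anyone finding this later: the workaround discussed in that issue is to steer PMIx away from the ds12 gds component. A sketch of what such a params file can contain follows; the component name comes from the issue thread and should be checked against your PMIx version.)

```
# /etc/openmpi/pmix-mca-params.conf
# Work around the ds12 lock error by selecting the hash gds component
# (assumption: per the discussion in open-mpi/ompi#7516)
gds = hash
```

The same component can reportedly be selected per job by exporting the corresponding PMIX_MCA_* environment variable instead of editing the file.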

Hello,
thanks for your help and suggestions.
In the end it was not an issue with Open MPI or any other system
component, but rather a single line in our code. I thought I was running
the tests with the -fbounds-check option, but it turns out I was not,
arrrghh!! At
some point I was writing outside one

Hello,
"Keller, Rainer" writes:
> You’re using MPI_Probe() with Threads; that’s not safe.
> Please consider using MPI_Mprobe() together with MPI_Mrecv().
many thanks for the suggestion. I will try the M variants, though I
was under the impression that mpi_probe() was OK as long as one made

Hello Jeff,
"Jeff Squyres (jsquyres)" writes:
> With THREAD_FUNNELED, it means that there can only be one thread in
> MPI at a time -- and it needs to be the same thread as the one that
> called MPI_INIT_THREAD.
>
> Is that the case in your app?
the master rank (i.e. 0) never creates threads,

Thanks Gilles,
Gilles Gouaillardet via users writes:
> You can first double check your
> MPI_Init_thread(..., MPI_THREAD_MULTIPLE, ...)
my code uses "mpi_thread_funneled" and OpenMPI was compiled with
MPI_THREAD_MULTIPLE support:
,----
| ompi_info | grep -i thread
| Thread support: p

Hello,
I'm running out of ideas, and wonder if someone here has some
tips on how to debug a segmentation fault I'm having with my
application [due to the nature of the problem I'm wondering if the
problem is with OpenMPI itself rather than my app, though at this point
I'm not leaning strong

Hello,
Joshua Ladd writes:
> These are very, very old versions of UCX and HCOLL installed in your
> environment. Also, MXM was deprecated years ago in favor of UCX. What
> version of MOFED is installed (run ofed_info -s)? What HCA generation
> is present (run ibstat).
MOFED is: MLNX_OFED_LINUX

Hello,
John Hearns via users writes:
> Stupid answer from me. If latency/bandwidth numbers are bad then check
> that you are really running over the interface that you think you
> should be. You could be falling back to running over Ethernet.
I'm quite out of my depth here, so all answers are h
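One concrete way to follow John's advice is to make the fall-back impossible rather than silent: restrict the transports on the command line, so the run fails loudly if InfiniBand is not actually usable (osu_latency is just an example benchmark; the openib BTL applies to 3.x-era builds, the UCX PML to newer ones):

```
# openib-era builds (Open MPI 3.x): error out unless InfiniBand verbs work
mpirun --mca btl self,vader,openib -np 2 ./osu_latency

# UCX builds: insist on the UCX PML
mpirun --mca pml ucx -np 2 ./osu_latency
```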

Hello,
"Jeff Squyres (jsquyres)" writes:
> I'd recommend against using Open MPI v3.1.0 -- it's quite old. If you
> have to use Open MPI v3.1.x, I'd at least suggest using v3.1.6, which
> has all the rolled-up bug fixes on the v3.1.x series.
>
> That being said, Open MPI v4.1.2 is the most curre

Hello,
Gilles Gouaillardet via users writes:
> Infiniband detection likely fails before checking expanded verbs.
thanks for this. In the end, after playing a bit with different
options, I managed to install OpenMPI 3.1.0 OK on our cluster using UCX
(I wanted 4.1.1, but that would not compile cl

Hi,
I'm trying to compile the latest OpenMPI version with Infiniband support
on our local cluster, but didn't get very far (since I'm installing this
via Spack, I also asked in their support group).
Spack is issuing the following
./configure step (see the opti
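For reference, a spec along these lines asks Spack for UCX support explicitly; the `fabrics` variant name is taken from the current Spack openmpi package and may differ in older Spack releases:

```
spack spec openmpi fabrics=ucx      # inspect the resolved configure options
spack install openmpi fabrics=ucx
```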

Hi,
Joshua Ladd writes:
> This is an ancient version of HCOLL. Please upgrade to the latest
> version (you can do this by installing HPC-X
> https://www.mellanox.com/products/hpc-x-toolkit)
Just to close the circle and report that all seems OK now.
I don't have root permission on this machine

Hi,
Joshua Ladd writes:
> We cannot reproduce this. On four nodes 20 PPN with and w/o hcoll it
> takes exactly the same 19 secs (80 ranks).
>
> What version of HCOLL are you using? Command line?
Thanks for having a look at this.
According to ompi_info, our OpenMPI (version 3.0.1) was config

Hi,
George Bosilca writes:
> If I'm not mistaken, hcoll is playing with the opal_progress in a way
> that conflicts with the blessed usage of progress in OMPI and prevents
> other components from advancing and timely completing requests. The
> impact is minimal for sequential applications using

Hi,
in one of our codes, we want to create a log of events that happen in
the MPI processes, where the number of these events and their timing
are unpredictable.
So I implemented a simple test code, where process 0
creates a thread that is just busy-waiting for messages from any
process, and which