Hello Gilles,

Thank you very much for the prompt patch.

I can confirm that configure now prefers the external PMIx, and that the
munge warnings and PMIx errors we observed are gone. An MPI hello world
runs successfully with both srun --mpi=pmix and srun --mpi=pmi2.
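
For completeness, the check was essentially the following (just a sketch;
hello_world.c is a plain MPI_Init/MPI_Comm_rank/MPI_Finalize program, and
the node and task counts are simply what we happened to use):

    mpicc hello_world.c -o hello_world
    srun --mpi=pmix -N 2 -n 4 ./hello_world
    srun --mpi=pmi2 -N 2 -n 4 ./hello_world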

I noticed that configure complained loudly about the missing external
libevent (i.e. the libevent-devel package), but did not complain at all
that an external hwloc (hwloc-devel) was missing as well.
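
In case it is useful: I assume (untested on our side so far) that installing
the hwloc development package as well and pointing configure explicitly at
the system copies, e.g.

    dnf install libevent-devel hwloc-devel
    ./configure --with-libevent=/usr --with-hwloc=/usr ...

would make configure pick up both external libraries. The package names and
the dnf call are what I would expect on an EL-based system, and the two
--with-* paths are meant analogously to the --with-pmix flags we already
pass.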

Best Regards

Christof



On Sat, May 20, 2023 at 06:54:54PM +0900, Gilles Gouaillardet wrote:
> Christof,
> 
> Open MPI switching to the internal PMIx is a bug I addressed in
> https://github.com/open-mpi/ompi/pull/11704
> 
> Feel free to manually download and apply the patch; you will then need
> recent autotools and will have to run
> ./autogen.pl --force
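> 
> For example, something along these lines (an untested sketch, assuming the
> PR patch applies cleanly to your 4.1.5 source tree):
> 
>   cd openmpi-4.1.5
>   curl -L https://github.com/open-mpi/ompi/pull/11704.patch | patch -p1
>   ./autogen.pl --force
>   ./configure <your usual options>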
> 
> Another option is to manually edit the configure file.
> 
> Look for the following snippet:
> 
>            # Final check - if they didn't point us explicitly at an external version
>            # but we found one anyway, use the internal version if it is higher
>            if test "$opal_external_pmix_version" != "internal" && (test -z "$with_pmix" || test "$with_pmix" = "yes")
> then :
>   if test "$opal_external_pmix_version" != "3x"
> 
> and replace the last line with
> 
>   if test $opal_external_pmix_version_major -lt 3
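> 
> After re-running configure you can double-check which PMIx was picked,
> for example with something like
> 
>   ./configure <your usual options> 2>&1 | tee configure.log
>   grep "PMIx version to be used" configure.log
> 
> which should now report "external" instead of "internal".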
> 
> 
> Cheers,
> 
> Gilles
> 
> On Sat, May 20, 2023 at 6:13 PM christof.koehler--- via users <
> users@lists.open-mpi.org> wrote:
> 
> > Hello Z. Matthias Krawutschke,
> >
> > On Fri, May 19, 2023 at 09:08:08PM +0200, Zhéxué M. Krawutschke wrote:
> > > Hello Christoph,
> > > what exactly is your problem with OpenMPI and Slurm?
> > > Do you compile the products yourself? Which LINUX distribution and
> > version are you using?
> > >
> > > If you compile the software yourself, could you please tell me what the
> > "configure" command looks like and which MUNGE version is in use? From the
> > distribution or compiled by yourself?
> > >
> > > I would be very happy to take on this topic and help you. You can also
> > reach me at +49 176 67270992.
> > > Best regards from Berlin
> >
> > please refer to my first mail in this thread (especially its end),
> > which is available here:
> > https://www.mail-archive.com/users@lists.open-mpi.org/msg35141.html
> >
> > I believe this contains the relevant information you are requesting. The
> > second mail which you are replying to was just additional information.
> > My apologies if this led to confusion.
> >
> > Please let me know if any relevant information is missing from my first
> > email. At the bottom of this email I include the ompi_info output as a
> > further addendum.
> >
> > To summarize: I would like to understand where the munge warning
> > and the PMIx error described in the first email (and in the github
> > issue linked there) come from. The explanation in the github issue
> > does not appear to be correct, as all munge libraries are
> > available everywhere. At the moment it appears to me that Open MPI's
> > configure erroneously decides to build and use the internal PMIx
> > instead of the (presumably) newer externally available PMIx,
> > leading to launcher problems with srun.
> >
> >
> > Best Regards
> >
> > Christof
> >
> >                  Package: Open MPI root@admin.service Distribution
> >                 Open MPI: 4.1.5
> >   Open MPI repo revision: v4.1.5
> >    Open MPI release date: Feb 23, 2023
> >                 Open RTE: 4.1.5
> >   Open RTE repo revision: v4.1.5
> >    Open RTE release date: Feb 23, 2023
> >                     OPAL: 4.1.5
> >       OPAL repo revision: v4.1.5
> >        OPAL release date: Feb 23, 2023
> >                  MPI API: 3.1.0
> >             Ident string: 4.1.5
> >                   Prefix: /cluster/mpi/openmpi/4.1.5/gcc-11.3.1
> >  Configured architecture: x86_64-pc-linux-gnu
> >           Configure host: admin.service
> >            Configured by: root
> >            Configured on: Wed May 17 18:45:42 UTC 2023
> >           Configure host: admin.service
> >   Configure command line: '--enable-mpi1-compatibility'
> > '--enable-orterun-prefix-by-default'
> > '--with-ofi=/cluster/libraries/libfabric/1.18.0/' '--with-slurm'
> > '--with-pmix' '--with-pmix-libdir=/usr/lib64' '--with-pmi'
> > '--with-pmi-libdir=/usr/lib64'
> > '--prefix=/cluster/mpi/openmpi/4.1.5/gcc-11.3.1'
> >                 Built by: root
> >                 Built on: Wed May 17 06:48:36 PM UTC 2023
> >               Built host: admin.service
> >               C bindings: yes
> >             C++ bindings: no
> >              Fort mpif.h: yes (all)
> >             Fort use mpi: yes (full: ignore TKR)
> >        Fort use mpi size: deprecated-ompi-info-value
> >         Fort use mpi_f08: yes
> >  Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
> > limitations in the gfortran compiler and/or Open MPI, does not support
> > the following: array subsections, direct passthru (where possible) to
> > underlying Open MPI's C functionality
> >   Fort mpi_f08 subarrays: no
> >            Java bindings: no
> >   Wrapper compiler rpath: runpath
> >               C compiler: gcc
> >      C compiler absolute: /usr/bin/gcc
> >   C compiler family name: GNU
> >       C compiler version: 11.3.1
> >             C++ compiler: g++
> >    C++ compiler absolute: /usr/bin/g++
> >            Fort compiler: gfortran
> >        Fort compiler abs: /usr/bin/gfortran
> >          Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
> >    Fort 08 assumed shape: yes
> >       Fort optional args: yes
> >           Fort INTERFACE: yes
> >     Fort ISO_FORTRAN_ENV: yes
> >        Fort STORAGE_SIZE: yes
> >       Fort BIND(C) (all): yes
> >       Fort ISO_C_BINDING: yes
> >  Fort SUBROUTINE BIND(C): yes
> >        Fort TYPE,BIND(C): yes
> >  Fort T,BIND(C,name="a"): yes
> >             Fort PRIVATE: yes
> >           Fort PROTECTED: yes
> >            Fort ABSTRACT: yes
> >        Fort ASYNCHRONOUS: yes
> >           Fort PROCEDURE: yes
> >          Fort USE...ONLY: yes
> >            Fort C_FUNLOC: yes
> >  Fort f08 using wrappers: yes
> >          Fort MPI_SIZEOF: yes
> >              C profiling: yes
> >            C++ profiling: no
> >    Fort mpif.h profiling: yes
> >   Fort use mpi profiling: yes
> >    Fort use mpi_f08 prof: yes
> >           C++ exceptions: no
> >           Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support:
> > yes, OMPI progress: no, ORTE progress: yes, Event lib: yes)
> >            Sparse Groups: no
> >   Internal debug support: no
> >   MPI interface warnings: yes
> >      MPI parameter check: runtime
> > Memory profiling support: no
> > Memory debugging support: no
> >               dl support: yes
> >    Heterogeneous support: no
> >  mpirun default --prefix: yes
> >        MPI_WTIME support: native
> >      Symbol vis. support: yes
> >    Host topology support: yes
> >             IPv6 support: no
> >       MPI1 compatibility: yes
> >           MPI extensions: affinity, cuda, pcollreq
> >    FT Checkpoint support: no (checkpoint thread: no)
> >    C/R Enabled Debugging: no
> >   MPI_MAX_PROCESSOR_NAME: 256
> >     MPI_MAX_ERROR_STRING: 256
> >      MPI_MAX_OBJECT_NAME: 64
> >         MPI_MAX_INFO_KEY: 36
> >         MPI_MAX_INFO_VAL: 256
> >        MPI_MAX_PORT_NAME: 1024
> >   MPI_MAX_DATAREP_STRING: 128
> >            MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >            MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >            MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA btl: ofi (MCA v2.1.0, API v3.1.0, Component v4.1.5)
> >                  MCA btl: self (MCA v2.1.0, API v3.1.0, Component
> > v4.1.5)
> >                  MCA btl: openib (MCA v2.1.0, API v3.1.0, Component
> > v4.1.5)
> >                  MCA btl: vader (MCA v2.1.0, API v3.1.0, Component
> > v4.1.5)
> >                  MCA btl: tcp (MCA v2.1.0, API v3.1.0, Component v4.1.5)
> >                  MCA btl: usnic (MCA v2.1.0, API v3.1.0, Component
> > v4.1.5)
> >             MCA compress: gzip (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >             MCA compress: bzip (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA crs: none (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                   MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >                MCA event: libevent2022 (MCA v2.1.0, API v2.0.0,
> > Component v4.1.5)
> >                MCA hwloc: hwloc201 (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                   MCA if: linux_ipv6 (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                   MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >          MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >          MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >               MCA memory: patcher (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA mpool: hugepage (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >              MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >                 MCA pmix: flux (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA pmix: pmix3x (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA pmix: s2 (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                 MCA pmix: isolated (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA pmix: s1 (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                MCA pstat: linux (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >               MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component
> > v4.1.5)
> >            MCA reachable: netlink (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >            MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA timer: linux (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >               MCA errmgr: default_orted (MCA v2.1.0, API v3.0.0,
> > Component v4.1.5)
> >               MCA errmgr: default_app (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >               MCA errmgr: default_tool (MCA v2.1.0, API v3.0.0,
> > Component v4.1.5)
> >               MCA errmgr: default_hnp (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA ess: pmi (MCA v2.1.0, API v3.0.0, Component v4.1.5)
> >                  MCA ess: tool (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA ess: slurm (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA ess: singleton (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA ess: hnp (MCA v2.1.0, API v3.0.0, Component v4.1.5)
> >                  MCA ess: env (MCA v2.1.0, API v3.0.0, Component v4.1.5)
> >                MCA filem: raw (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >              MCA grpcomm: direct (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA iof: orted (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA iof: hnp (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                  MCA iof: tool (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA odls: pspawn (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA odls: default (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA oob: tcp (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                  MCA plm: rsh (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                  MCA plm: slurm (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA plm: isolated (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA ras: slurm (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA ras: simulator (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA regx: fwd (MCA v2.1.0, API v1.0.0, Component v4.1.5)
> >                 MCA regx: reverse (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >                 MCA regx: naive (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >                MCA rmaps: rank_file (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA rmaps: ppr (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                MCA rmaps: mindist (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA rmaps: seq (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                MCA rmaps: resilient (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA rmaps: round_robin (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA rml: oob (MCA v2.1.0, API v3.0.0, Component v4.1.5)
> >               MCA routed: radix (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >               MCA routed: binomial (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >               MCA routed: direct (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA rtc: hwloc (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >               MCA schizo: flux (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >               MCA schizo: orte (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >               MCA schizo: jsm (MCA v2.1.0, API v1.0.0, Component v4.1.5)
> >               MCA schizo: ompi (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >               MCA schizo: slurm (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >                MCA state: novm (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >                MCA state: tool (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >                MCA state: orted (MCA v2.1.0, API v1.0.0, Component
> > v4.1.5)
> >                MCA state: hnp (MCA v2.1.0, API v1.0.0, Component v4.1.5)
> >                MCA state: app (MCA v2.1.0, API v1.0.0, Component v4.1.5)
> >                  MCA bml: r2 (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                 MCA coll: tuned (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA coll: libnbc (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA coll: basic (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA coll: monitoring (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA coll: self (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA coll: sync (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA coll: sm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                 MCA coll: adapt (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA coll: inter (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                 MCA coll: han (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >                 MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0,
> > Component v4.1.5)
> >                MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA fcoll: two_phase (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                   MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >
> >                   MCA io: romio321 (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                   MCA io: ompio (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA mtl: ofi (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >
> >                  MCA mtl: psm2 (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                   MCA op: avx (MCA v2.1.0, API v1.0.0, Component v4.1.5)
> >
> >                  MCA osc: pt2pt (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component
> > v4.1.5)
> >                  MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v4.1.5)
> >
> >                  MCA pml: v (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >
> >                  MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >
> >                  MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >
> >                  MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >                  MCA rte: orte (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >             MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >             MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >             MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v4.1.5)
> >
> >                 MCA topo: basic (MCA v2.1.0, API v2.2.0, Component
> > v4.1.5)
> >                 MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
> > v4.1.5)
> >            MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
> > v4.1.5)
> >
> >
> > >
> > > Z. Matthias Krawutschke
> > >
> > > > On Thursday, May 18, 2023 at 5:47 PM, christof.koehler--- via users
> > > > <users@lists.open-mpi.org> wrote:
> > > > Hello again,
> > > >
> > > > I should add that the Open MPI configure decides to use the internal
> > > > PMIx:
> > > >
> > > > configure: WARNING: discovered external PMIx version is less than internal version 3.x
> > > > configure: WARNING: using internal PMIx
> > > > ...
> > > > ...
> > > > checking if user requested PMI support... yes
> > > > checking for pmi.h in /usr/include... not found
> > > > checking for pmi.h in /usr/include/slurm... found
> > > > checking pmi.h usability... yes
> > > > checking pmi.h presence... yes
> > > > checking for pmi.h... yes
> > > > checking for libpmi in /usr/lib64... found
> > > > checking for PMI_Init in -lpmi... yes
> > > > checking for pmi2.h in /usr/include... not found
> > > > checking for pmi2.h in /usr/include/slurm... found
> > > > checking pmi2.h usability... yes
> > > > checking pmi2.h presence... yes
> > > > checking for pmi2.h... yes
> > > > checking for libpmi2 in /usr/lib64... found
> > > > checking for PMI2_Init in -lpmi2... yes
> > > > checking for pmix.h in ... not found
> > > > checking for pmix.h in /include... not found
> > > > checking can PMI support be built... yes
> > > > checking if user requested internal PMIx support(yes)... no
> > > > checking for pmix.h in /usr... not found
> > > > checking for pmix.h in /usr/include... found
> > > > checking libpmix.* in /usr/lib64... found
> > > > checking PMIx version... version file found
> > > > checking version 4x... found
> > > > checking PMIx version to be used... internal
> > > >
> > > > I am not sure how it arrives at that decision; the external one is
> > > > already quite a new version.
> > > >
> > > > # srun --mpi=list
> > > > MPI plugin types are...
> > > > pmix
> > > > cray_shasta
> > > > none
> > > > pmi2
> > > > specific pmix plugin versions available: pmix_v4
> > > >
> > > >
> > > > Best Regards
> > > >
> > > > Christof
> > > >
> > > > --
> > > > Dr. rer. nat. Christof Köhler email: c.koeh...@uni-bremen.de
> > > > Universitaet Bremen/FB1/BCCMS phone: +49-(0)421-218-62334
> > > > Am Fallturm 1/ TAB/ Raum 3.06 fax: +49-(0)421-218-62770
> > > > 28359 Bremen
> > > >
> >
> > --
> > Dr. rer. nat. Christof Köhler       email: c.koeh...@uni-bremen.de
> > Universitaet Bremen/FB1/BCCMS       phone:  +49-(0)421-218-62334
> > Am Fallturm 1/ TAB/ Raum 3.06       fax: +49-(0)421-218-62770
> > 28359 Bremen
> >

-- 
Dr. rer. nat. Christof Köhler       email: c.koeh...@uni-bremen.de
Universitaet Bremen/FB1/BCCMS       phone:  +49-(0)421-218-62334
Am Fallturm 1/ TAB/ Raum 3.06       fax: +49-(0)421-218-62770
28359 Bremen  
