Adding '--pmixmca ptl_tcp_if_include lo0' to the mpirun argument list seems
to fix (or at least work around) the problem.
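For reference, the full invocation I used (same test program as in my original
message) was along these lines; as I understand it, the flag just restricts
PMIx's TCP component to the loopback interface (lo0 on macOS):

$ mpirun --pmixmca ptl_tcp_if_include lo0 -n 2 ./mpi_init_test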

On Mon, Feb 5, 2024 at 1:49 PM John Haiducek <jhaid...@gmail.com> wrote:

> Thanks, George; that issue you linked does look like it could be related.
>
> Output from ompi_info:
>
>                  Package: Open MPI brew@Monterey-arm64.local Distribution
>                 Open MPI: 5.0.1
>   Open MPI repo revision: v5.0.1
>    Open MPI release date: Dec 20, 2023
>                  MPI API: 3.1.0
>             Ident string: 5.0.1
>                   Prefix: /opt/homebrew/Cellar/open-mpi/5.0.1
>  Configured architecture: aarch64-apple-darwin21.6.0
>            Configured by: brew
>            Configured on: Wed Dec 20 22:18:10 UTC 2023
>           Configure host: Monterey-arm64.local
>   Configure command line: '--disable-debug' '--disable-dependency-tracking'
>                           '--prefix=/opt/homebrew/Cellar/open-mpi/5.0.1'
>                           '--libdir=/opt/homebrew/Cellar/open-mpi/5.0.1/lib'
>                           '--disable-silent-rules' '--enable-ipv6'
>                           '--enable-mca-no-build=reachable-netlink'
>                           '--sysconfdir=/opt/homebrew/etc'
>                           '--with-hwloc=/opt/homebrew/opt/hwloc'
>                           '--with-libevent=/opt/homebrew/opt/libevent'
>                           '--with-pmix=/opt/homebrew/opt/pmix' '--with-sge'
>                 Built by: brew
>                 Built on: Wed Dec 20 22:18:10 UTC 2023
>               Built host: Monterey-arm64.local
>               C bindings: yes
>              Fort mpif.h: yes (single underscore)
>             Fort use mpi: yes (full: ignore TKR)
>        Fort use mpi size: deprecated-ompi-info-value
>         Fort use mpi_f08: yes
>  Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
>                           limitations in the gfortran compiler and/or Open
>                           MPI, does not support the following: array
>                           subsections, direct passthru (where possible) to
>                           underlying Open MPI's C functionality
>   Fort mpi_f08 subarrays: no
>            Java bindings: no
>   Wrapper compiler rpath: unnecessary
>               C compiler: clang
>      C compiler absolute: clang
>   C compiler family name: CLANG
>       C compiler version: 14.0.0 (clang-1400.0.29.202)
>             C++ compiler: clang++
>    C++ compiler absolute: clang++
>            Fort compiler: gfortran
>        Fort compiler abs: /opt/homebrew/opt/gcc/bin/gfortran
>          Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
>    Fort 08 assumed shape: yes
>       Fort optional args: yes
>           Fort INTERFACE: yes
>     Fort ISO_FORTRAN_ENV: yes
>        Fort STORAGE_SIZE: yes
>       Fort BIND(C) (all): yes
>       Fort ISO_C_BINDING: yes
>  Fort SUBROUTINE BIND(C): yes
>        Fort TYPE,BIND(C): yes
>  Fort T,BIND(C,name="a"): yes
>             Fort PRIVATE: yes
>            Fort ABSTRACT: yes
>        Fort ASYNCHRONOUS: yes
>           Fort PROCEDURE: yes
>          Fort USE...ONLY: yes
>            Fort C_FUNLOC: yes
>  Fort f08 using wrappers: yes
>          Fort MPI_SIZEOF: yes
>              C profiling: yes
>    Fort mpif.h profiling: yes
>   Fort use mpi profiling: yes
>    Fort use mpi_f08 prof: yes
>           Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
>                           OMPI progress: no, Event lib: yes)
>            Sparse Groups: no
>   Internal debug support: no
>   MPI interface warnings: yes
>      MPI parameter check: runtime
> Memory profiling support: no
> Memory debugging support: no
>               dl support: yes
>    Heterogeneous support: no
>        MPI_WTIME support: native
>      Symbol vis. support: yes
>    Host topology support: yes
>             IPv6 support: yes
>           MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
>  Fault Tolerance support: yes
>           FT MPI support: yes
>   MPI_MAX_PROCESSOR_NAME: 256
>     MPI_MAX_ERROR_STRING: 256
>      MPI_MAX_OBJECT_NAME: 64
>         MPI_MAX_INFO_KEY: 36
>         MPI_MAX_INFO_VAL: 256
>        MPI_MAX_PORT_NAME: 1024
>   MPI_MAX_DATAREP_STRING: 128
>          MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>            MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>            MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>            MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>                  MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>                  MCA btl: sm (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>                  MCA btl: tcp (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>                   MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>                   MCA if: bsdx_ipv6 (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>                   MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>          MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>          MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                MCA mpool: hugepage (MCA v2.1.0, API v3.1.0, Component
>                           v5.0.1)
>              MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component
>                           v5.0.1)
>               MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>            MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>                MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>              MCA threads: pthreads (MCA v2.1.0, API v1.0.0, Component
>                           v5.0.1)
>                MCA timer: darwin (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                  MCA bml: r2 (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>                 MCA coll: adapt (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA coll: basic (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA coll: han (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA coll: inter (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA coll: libnbc (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA coll: self (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA coll: sync (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA coll: tuned (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA coll: ftagree (MCA v2.1.0, API v2.4.0, Component
>                           v5.0.1)
>                 MCA coll: monitoring (MCA v2.1.0, API v2.4.0, Component
>                           v5.0.1)
>                 MCA coll: sm (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>                 MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>                MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>                MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>                MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                   MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                 MCA hook: comm_method (MCA v2.1.0, API v1.0.0, Component
>                           v5.0.1)
>                   MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                   MCA io: romio341 (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>                  MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v5.0.1)
>                  MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component
>                           v5.0.1)
>                  MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v5.0.1)
>                 MCA part: persist (MCA v2.1.0, API v4.0.0, Component
>                           v5.0.1)
>                  MCA pml: cm (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>                  MCA pml: monitoring (MCA v2.1.0, API v2.1.0, Component
>                           v5.0.1)
>                  MCA pml: ob1 (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>                  MCA pml: v (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>             MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>             MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>             MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>                 MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v5.0.1)
>                 MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component
>                           v5.0.1)
>            MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component
>                           v5.0.1)
>
> On Mon, Feb 5, 2024 at 12:48 PM George Bosilca <bosi...@icl.utk.edu>
> wrote:
>
>> OMPI seems unable to create a communication medium between your
>> processes. There are a few known issues on OSX; please read
>> https://github.com/open-mpi/ompi/issues/12273 for more info.
>>
>> Can you provide the header of the ompi_info output? What I'm interested
>> in is the part about `Configure command line:`.
>>
>> George.
>>
>>
>> On Mon, Feb 5, 2024 at 12:18 PM John Haiducek via users <
>> users@lists.open-mpi.org> wrote:
>>
>>> I'm having problems running programs compiled against the Open MPI 5.0.1
>>> package provided by Homebrew on macOS (arm) 12.6.1.
>>>
>>> When running a Fortran test program that simply calls MPI_Init followed by
>>> MPI_Finalize (a sketch of the program is at the end of this message), I get
>>> the following output:
>>>
>>> $ mpirun -n 2 ./mpi_init_test
>>>
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or
>>> environment
>>> problems.  This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>>   PML add procs failed
>>>   --> Returned "Not found" (-13) instead of "Success" (0)
>>>
>>> --------------------------------------------------------------------------
>>>
>>> --------------------------------------------------------------------------
>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during MPI_INIT; some of which are due to configuration or
>>> environment
>>> problems.  This failure appears to be an internal failure; here's some
>>> additional information (which may only be relevant to an Open MPI
>>> developer):
>>>
>>>   ompi_mpi_init: ompi_mpi_instance_init failed
>>>   --> Returned "Not found" (-13) instead of "Success" (0)
>>>
>>> --------------------------------------------------------------------------
>>> [haiducek-lt:00000] *** An error occurred in MPI_Init
>>> [haiducek-lt:00000] *** reported by process [1905590273,1]
>>> [haiducek-lt:00000] *** on a NULL communicator
>>> [haiducek-lt:00000] *** Unknown error
>>> [haiducek-lt:00000] *** MPI_ERRORS_ARE_FATAL (processes in this
>>> communicator will now abort,
>>> [haiducek-lt:00000] ***    and MPI will try to terminate your MPI job as
>>> well)
>>>
>>> --------------------------------------------------------------------------
>>> prterun detected that one or more processes exited with non-zero status,
>>> thus causing the job to be terminated. The first process to do so was:
>>>
>>>    Process name: [prterun-haiducek-lt-15584@1,1]
>>>    Exit code:    14
>>>
>>> --------------------------------------------------------------------------
>>>
>>> I'm not sure whether this is the result of a bug in Open MPI, in the
>>> Homebrew package, or a misconfiguration of my system. Any suggestions for
>>> troubleshooting this?
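>>>
>>> For reference, the test program is essentially just the following sketch
>>> (shown here with the mpi module; the program name and the explicit ierr
>>> argument are incidental):
>>>
>>>   program mpi_init_test
>>>     use mpi
>>>     implicit none
>>>     integer :: ierr
>>>     call MPI_Init(ierr)
>>>     call MPI_Finalize(ierr)
>>>   end program mpi_init_test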
>>>
>>
