Stupid question... Why is it going 'out' to the loopback address? Is shared memory not being used these days?
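(One quick way to check that, if it helps: forcing the shared-memory path explicitly should show whether the MPI data path itself can use it. This is only a sketch, reusing the test binary name from the thread below:

  $ mpirun --mca btl self,sm -n 2 ./mpi_init_test

If that runs, the loopback traffic is presumably coming from the PMIx/runtime wire-up rather than from the MPI point-to-point path; a full reproducer sketch is appended at the bottom of the quoted thread.)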
On Mon, Feb 5, 2024, 8:31 PM John Haiducek via users <users@lists.open-mpi.org> wrote:

> Adding '--pmixmca ptl_tcp_if_include lo0' to the mpirun argument list
> seems to fix (or at least work around) the problem.
>
> On Mon, Feb 5, 2024 at 1:49 PM John Haiducek <jhaid...@gmail.com> wrote:
>
>> Thanks, George, that issue you linked certainly looks potentially related.
>>
>> Output from ompi_info:
>>
>>   Package: Open MPI brew@Monterey-arm64.local Distribution
>>   Open MPI: 5.0.1
>>   Open MPI repo revision: v5.0.1
>>   Open MPI release date: Dec 20, 2023
>>   MPI API: 3.1.0
>>   Ident string: 5.0.1
>>   Prefix: /opt/homebrew/Cellar/open-mpi/5.0.1
>>   Configured architecture: aarch64-apple-darwin21.6.0
>>   Configured by: brew
>>   Configured on: Wed Dec 20 22:18:10 UTC 2023
>>   Configure host: Monterey-arm64.local
>>   Configure command line: '--disable-debug' '--disable-dependency-tracking'
>>     '--prefix=/opt/homebrew/Cellar/open-mpi/5.0.1'
>>     '--libdir=/opt/homebrew/Cellar/open-mpi/5.0.1/lib'
>>     '--disable-silent-rules' '--enable-ipv6'
>>     '--enable-mca-no-build=reachable-netlink'
>>     '--sysconfdir=/opt/homebrew/etc'
>>     '--with-hwloc=/opt/homebrew/opt/hwloc'
>>     '--with-libevent=/opt/homebrew/opt/libevent'
>>     '--with-pmix=/opt/homebrew/opt/pmix' '--with-sge'
>>   Built by: brew
>>   Built on: Wed Dec 20 22:18:10 UTC 2023
>>   Built host: Monterey-arm64.local
>>   C bindings: yes
>>   Fort mpif.h: yes (single underscore)
>>   Fort use mpi: yes (full: ignore TKR)
>>   Fort use mpi size: deprecated-ompi-info-value
>>   Fort use mpi_f08: yes
>>   Fort mpi_f08 compliance: The mpi_f08 module is available, but due to
>>     limitations in the gfortran compiler and/or Open MPI, does not support
>>     the following: array subsections, direct passthru (where possible) to
>>     underlying Open MPI's C functionality
>>   Fort mpi_f08 subarrays: no
>>   Java bindings: no
>>   Wrapper compiler rpath: unnecessary
>>   C compiler: clang
>>   C compiler absolute: clang
>>   C compiler family name: CLANG
>>   C compiler version: 14.0.0 (clang-1400.0.29.202)
>>   C++ compiler: clang++
>>   C++ compiler absolute: clang++
>>   Fort compiler: gfortran
>>   Fort compiler abs: /opt/homebrew/opt/gcc/bin/gfortran
>>   Fort ignore TKR: yes (!GCC$ ATTRIBUTES NO_ARG_CHECK ::)
>>   Fort 08 assumed shape: yes
>>   Fort optional args: yes
>>   Fort INTERFACE: yes
>>   Fort ISO_FORTRAN_ENV: yes
>>   Fort STORAGE_SIZE: yes
>>   Fort BIND(C) (all): yes
>>   Fort ISO_C_BINDING: yes
>>   Fort SUBROUTINE BIND(C): yes
>>   Fort TYPE,BIND(C): yes
>>   Fort T,BIND(C,name="a"): yes
>>   Fort PRIVATE: yes
>>   Fort ABSTRACT: yes
>>   Fort ASYNCHRONOUS: yes
>>   Fort PROCEDURE: yes
>>   Fort USE...ONLY: yes
>>   Fort C_FUNLOC: yes
>>   Fort f08 using wrappers: yes
>>   Fort MPI_SIZEOF: yes
>>   C profiling: yes
>>   Fort mpif.h profiling: yes
>>   Fort use mpi profiling: yes
>>   Fort use mpi_f08 prof: yes
>>   Thread support: posix (MPI_THREAD_MULTIPLE: yes, OPAL support: yes,
>>     OMPI progress: no, Event lib: yes)
>>   Sparse Groups: no
>>   Internal debug support: no
>>   MPI interface warnings: yes
>>   MPI parameter check: runtime
>>   Memory profiling support: no
>>   Memory debugging support: no
>>   dl support: yes
>>   Heterogeneous support: no
>>   MPI_WTIME support: native
>>   Symbol vis. support: yes
>>   Host topology support: yes
>>   IPv6 support: yes
>>   MPI extensions: affinity, cuda, ftmpi, rocm, shortfloat
>>   Fault Tolerance support: yes
>>   FT MPI support: yes
>>   MPI_MAX_PROCESSOR_NAME: 256
>>   MPI_MAX_ERROR_STRING: 256
>>   MPI_MAX_OBJECT_NAME: 64
>>   MPI_MAX_INFO_KEY: 36
>>   MPI_MAX_INFO_VAL: 256
>>   MPI_MAX_PORT_NAME: 1024
>>   MPI_MAX_DATAREP_STRING: 128
>>   MCA accelerator: null (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>>   MCA allocator: basic (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA allocator: bucket (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA backtrace: execinfo (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA btl: self (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>>   MCA btl: sm (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>>   MCA btl: tcp (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>>   MCA dl: dlopen (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>>   MCA if: bsdx_ipv6 (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA if: posix_ipv4 (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA installdirs: env (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA installdirs: config (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA mpool: hugepage (MCA v2.1.0, API v3.1.0, Component v5.0.1)
>>   MCA patcher: overwrite (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>>   MCA rcache: grdma (MCA v2.1.0, API v3.3.0, Component v5.0.1)
>>   MCA reachable: weighted (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA shmem: mmap (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA shmem: posix (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA shmem: sysv (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA threads: pthreads (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>>   MCA timer: darwin (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA bml: r2 (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>   MCA coll: adapt (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: basic (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: han (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: inter (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: libnbc (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: self (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: sync (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: tuned (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: ftagree (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: monitoring (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA coll: sm (MCA v2.1.0, API v2.4.0, Component v5.0.1)
>>   MCA fbtl: posix (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA fcoll: dynamic (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA fcoll: dynamic_gen2 (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA fcoll: individual (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA fcoll: vulcan (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA fs: ufs (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA hook: comm_method (MCA v2.1.0, API v1.0.0, Component v5.0.1)
>>   MCA io: ompio (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA io: romio341 (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA osc: sm (MCA v2.1.0, API v3.0.0, Component v5.0.1)
>>   MCA osc: monitoring (MCA v2.1.0, API v3.0.0, Component v5.0.1)
>>   MCA osc: rdma (MCA v2.1.0, API v3.0.0, Component v5.0.1)
>>   MCA part: persist (MCA v2.1.0, API v4.0.0, Component v5.0.1)
>>   MCA pml: cm (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>   MCA pml: monitoring (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>   MCA pml: ob1 (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>   MCA pml: v (MCA v2.1.0, API v2.1.0, Component v5.0.1)
>>   MCA sharedfp: individual (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA sharedfp: lockedfile (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA sharedfp: sm (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>   MCA topo: basic (MCA v2.1.0, API v2.2.0, Component v5.0.1)
>>   MCA topo: treematch (MCA v2.1.0, API v2.2.0, Component v5.0.1)
>>   MCA vprotocol: pessimist (MCA v2.1.0, API v2.0.0, Component v5.0.1)
>>
>> On Mon, Feb 5, 2024 at 12:48 PM George Bosilca <bosi...@icl.utk.edu> wrote:
>>
>>> OMPI seems unable to create a communication medium between your
>>> processes. There are a few known issues on OSX; please read
>>> https://github.com/open-mpi/ompi/issues/12273 for more info.
>>>
>>> Can you provide the header of the ompi_info command? What I'm interested
>>> in is the part about `Configure command line:`
>>>
>>> George.
>>>
>>> On Mon, Feb 5, 2024 at 12:18 PM John Haiducek via users <users@lists.open-mpi.org> wrote:
>>>
>>>> I'm having problems running programs compiled against the Open MPI 5.0.1
>>>> package provided by Homebrew on macOS (arm) 12.6.1.
>>>>
>>>> When running a Fortran test program that simply calls MPI_Init followed
>>>> by MPI_Finalize, I get the following output:
>>>>
>>>> $ mpirun -n 2 ./mpi_init_test
>>>> --------------------------------------------------------------------------
>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>> likely to abort. There are many reasons that a parallel process can
>>>> fail during MPI_INIT; some of which are due to configuration or environment
>>>> problems. This failure appears to be an internal failure; here's some
>>>> additional information (which may only be relevant to an Open MPI
>>>> developer):
>>>>
>>>>   PML add procs failed
>>>>   --> Returned "Not found" (-13) instead of "Success" (0)
>>>> --------------------------------------------------------------------------
>>>> --------------------------------------------------------------------------
>>>> It looks like MPI_INIT failed for some reason; your parallel process is
>>>> likely to abort. There are many reasons that a parallel process can
>>>> fail during MPI_INIT; some of which are due to configuration or environment
>>>> problems. This failure appears to be an internal failure; here's some
>>>> additional information (which may only be relevant to an Open MPI
>>>> developer):
>>>>
>>>>   ompi_mpi_init: ompi_mpi_instance_init failed
>>>>   --> Returned "Not found" (-13) instead of "Success" (0)
>>>> --------------------------------------------------------------------------
>>>> [haiducek-lt:00000] *** An error occurred in MPI_Init
>>>> [haiducek-lt:00000] *** reported by process [1905590273,1]
>>>> [haiducek-lt:00000] *** on a NULL communicator
>>>> [haiducek-lt:00000] *** Unknown error
>>>> [haiducek-lt:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
>>>> [haiducek-lt:00000] ***   and MPI will try to terminate your MPI job as well)
>>>> --------------------------------------------------------------------------
>>>> prterun detected that one or more processes exited with non-zero status,
>>>> thus causing the job to be terminated. The first process to do so was:
>>>>
>>>>   Process name: [prterun-haiducek-lt-15584@1,1]
>>>>   Exit code: 14
>>>> --------------------------------------------------------------------------
>>>>
>>>> I'm not sure whether this is the result of a bug in Open MPI, in the
>>>> Homebrew package, or a misconfiguration of my system. Any suggestions
>>>> for troubleshooting this?
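
For reference, a minimal reproducer matching the description above (a Fortran program that only calls MPI_Init and then MPI_Finalize) might look like the following; the file name and the choice of the plain 'use mpi' module are assumptions (ompi_info reports the mpi_f08 module as available too):

  program mpi_init_test
    use mpi                    ! the mpi_f08 module would also work here
    implicit none
    integer :: ierr
    call MPI_Init(ierr)        ! initialize the MPI library
    call MPI_Finalize(ierr)    ! shut it down again
  end program mpi_init_test

Built and run roughly as:

  $ mpifort -o mpi_init_test mpi_init_test.f90
  $ mpirun -n 2 ./mpi_init_test
  $ mpirun --pmixmca ptl_tcp_if_include lo0 -n 2 ./mpi_init_test

where the last command adds the loopback-interface workaround quoted at the top of this thread.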