Hello,

Gilles Gouaillardet via users <users@lists.open-mpi.org> writes:
> Infiniband detection likely fails before checking expanded verbs.

Thanks for this. In the end, after playing a bit with different
options, I managed to install OpenMPI 3.1.0 on our cluster using UCX
(I wanted 4.1.1, but that would not compile cleanly with the old
version of UCX that is installed on the cluster). The configure command
line (as reported by ompi_info) was:

,----
| Configure command line: '--prefix=/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-9.3.0/openmpi-3.1.0-g5a7szwxcsgmyibqvwwavfkz5b4i2ym7'
|                         '--enable-shared' '--disable-silent-rules'
|                         '--disable-builtin-atomics' '--with-pmi=/usr'
|                         '--with-zlib=/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-9.3.0/zlib-1.2.11-hrstx5ffrg4f4k3xc2anyxed3mmgdcoz'
|                         '--without-knem' '--with-hcoll=/opt/mellanox/hcoll'
|                         '--without-psm' '--without-ofi' '--without-cma'
|                         '--with-ucx=/opt/ucx' '--without-fca'
|                         '--without-mxm' '--without-verbs' '--without-xpmem'
|                         '--without-psm2' '--without-alps' '--without-lsf'
|                         '--without-sge' '--with-slurm' '--without-tm'
|                         '--without-loadleveler' '--disable-memchecker'
|                         '--with-hwloc=/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-9.3.0/hwloc-1.11.13-kpjkidab37wn25h2oyh3eva43ycjb6c5'
|                         '--disable-java' '--disable-mpi-java'
|                         '--without-cuda' '--enable-wrapper-rpath'
|                         '--disable-wrapper-runpath' '--disable-mpi-cxx'
|                         '--disable-cxx-exceptions'
|                         '--with-wrapper-ldflags=-Wl,-rpath,/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-7.2.0/gcc-9.3.0-ghr2jekwusoa4zip36xsa3okgp3bylqm/lib/gcc/x86_\
| 64-pc-linux-gnu/9.3.0
|                         -Wl,-rpath,/storage/projects/can30/angelv/spack/opt/spack/linux-sles12-sandybridge/gcc-7.2.0/gcc-9.3.0-ghr2jekwusoa4zip36xsa3okgp3bylqm/lib64'
`----

The versions that I'm using are:

gcc:   9.3.0
mxm:   3.6.3102 (though I configure OpenMPI --without-mxm)
hcoll: 3.8.1649
knem:  1.1.2.90mlnx2 (though I configure OpenMPI --without-knem)
ucx:   1.2.2947
slurm: 18.08.7

Everything seems to run fine, but I get a couple of warnings, and I'm
not sure how much I should worry about them or what I could do about
them:

1) Conflicting CPU frequencies detected:

[1645221586.038838] [s01r3b78:11041:0] sys.c:744 MXM WARN Conflicting CPU frequencies detected, using: 3151.41
[1645221585.740595] [s01r3b79:11484:0] sys.c:744 MXM WARN Conflicting CPU frequencies detected, using: 2998.76

2) Won't use knem. In a previous attempt I specified --with-knem, and
   I got this warning about not being able to open /dev/knem. I guess
   our cluster is not properly configured w.r.t. knem, so I built
   OpenMPI again --without-knem, but I still get this message:

[1645221587.091122] [s01r3b74:9054 :0] shm.c:65 MXM WARN Could not open the KNEM device file at /dev/knem : No such file or directory. Won't use knem.
[1645221587.104807] [s01r3b76:8610 :0] shm.c:65 MXM WARN Could not open the KNEM device file at /dev/knem : No such file or directory. Won't use knem.

Any help/pointers appreciated.

Many thanks,
-- 
Ángel de Vicente
 Tel.: +34 922 605 747
 Web.: http://research.iac.es/proyecto/polmag/

---------------------------------------------------------------------------------------------
DISCLAIMER: This message may contain confidential and/or privileged
information. If you are not the final recipient or have received it in
error, please notify the sender immediately. Any unauthorized use of
the content of this message is strictly prohibited. More information:
https://www.iac.es/en/disclaimer
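
When many ranks print warnings like the two quoted above, it can help to see which message comes from which nodes. The following is only a minimal sketch in Python: the function name summarize_mxm_warnings is hypothetical, and the regular expression is an assumption derived solely from the four log lines shown in this message, so it may not match other MXM output formats.

```python
import re
from collections import defaultdict

# Assumed line layout, inferred from the four log lines quoted in this message:
# [timestamp] [host:pid:tid] source.c:line MXM WARN message
LINE_RE = re.compile(
    r"^\[(?P<ts>\d+\.\d+)\]\s+"
    r"\[(?P<host>[^:\]]+):(?P<pid>\d+)\s*:(?P<tid>\d+)\]\s+"
    r"(?P<src>\S+)\s+MXM\s+WARN\s+(?P<msg>.*)$"
)


def summarize_mxm_warnings(lines):
    """Group MXM WARN messages by text and list the hosts that emitted them.

    A trailing number (such as the detected CPU frequency) is replaced by
    "<value>" so that messages differing only in that value are grouped.
    Lines that do not match the assumed layout are ignored.
    """
    hosts_by_msg = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        msg = re.sub(r"\d+(?:\.\d+)?$", "<value>", match.group("msg"))
        hosts_by_msg[msg].add(match.group("host"))
    return {msg: sorted(hosts) for msg, hosts in hosts_by_msg.items()}


if __name__ == "__main__":
    sample = [
        "[1645221586.038838] [s01r3b78:11041:0] sys.c:744 MXM WARN "
        "Conflicting CPU frequencies detected, using: 3151.41",
        "[1645221585.740595] [s01r3b79:11484:0] sys.c:744 MXM WARN "
        "Conflicting CPU frequencies detected, using: 2998.76",
        "[1645221587.091122] [s01r3b74:9054 :0] shm.c:65 MXM WARN "
        "Could not open the KNEM device file at /dev/knem : "
        "No such file or directory. Won't use knem.",
    ]
    for message, hosts in summarize_mxm_warnings(sample).items():
        print(f"{len(hosts)} host(s): {message}")
        print("    " + ", ".join(hosts))
```

In practice the lines would come from the job's stderr file rather than a hard-coded list; the grouping only reports where each warning appears and says nothing about its cause.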