Jeff, you know as well as I do that EVERYTHING is in the path at Cornelis Networks.

On Wed, 7 Apr 2021 at 14:59, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

> Check the output from ldd in a non-interactive login: your LD_LIBRARY_PATH
> probably doesn't include the location of the Intel runtime.
>
> E.g.
>
> ssh othernode ldd /path/to/orted
>
> Your shell startup files may well differentiate between interactive and
> non-interactive logins (i.e., it may set PATH / LD_LIBRARY_PATH / etc.
> differently).
>
>
> On Apr 7, 2021, at 7:21 AM, John Hearns via users <
> users@lists.open-mpi.org> wrote:
>
> Manually log into one of your nodes. Load the modules you use in a batch
> job. Run 'ldd' on your executable.
> Start at the bottom and work upwards...
>
> By the way, have you looked at using Easybuild? Would be good to have your
> input there maybe.
>
>
> On Wed, 7 Apr 2021 at 01:01, Heinz, Michael William via users <
> users@lists.open-mpi.org> wrote:
>
>> I’m having a heck of a time building OMPI with Intel C. Compilation goes
>> fine, installation goes fine, compiling test apps (the OSU benchmarks) goes
>> fine…
>>
>> but when I go to actually run an MPI app I get:
>>
>> [awbp025:~/work/osu-icc](N/A)$ /usr/mpi/icc/openmpi-icc/bin/mpirun -np 2
>> -H awbp025,awbp026,awbp027,awbp028 -x FI_PROVIDER=opa1x -x
>> LD_LIBRARY_PATH=/usr/mpi/icc/openmpi-icc/lib64:/lib hostname
>>
>> /usr/mpi/icc/openmpi-icc/bin/orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
>> /usr/mpi/icc/openmpi-icc/bin/orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
>>
>> Looking at orted, it does seem like the binary is linking correctly:
>>
>> [awbp025:~/work/osu-icc](N/A)$ /usr/mpi/icc/openmpi-icc/bin/orted
>> [awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ess_env_module.c at line 135
>> [awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file util/session_dir.c at line 107
>> [awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file util/session_dir.c at line 346
>> [awbp025:620372] [[INVALID],INVALID] ORTE_ERROR_LOG: Bad parameter in file base/ess_base_std_orted.c at line 264
>> --------------------------------------------------------------------------
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort. There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems. This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>>
>> orte_session_dir failed
>> --> Returned value Bad parameter (-5) instead of ORTE_SUCCESS
>> --------------------------------------------------------------------------
>>
>> and…
>>
>> [awbp025:~/work/osu-icc](N/A)$ ldd /usr/mpi/icc/openmpi-icc/bin/orted
>> linux-vdso.so.1 (0x00007fffc2ebf000)
>> libopen-rte.so.40 => /usr/mpi/icc/openmpi-icc/lib/libopen-rte.so.40 (0x00007fdaa6404000)
>> libopen-pal.so.40 => /usr/mpi/icc/openmpi-icc/lib/libopen-pal.so.40 (0x00007fdaa60bd000)
>> libopen-orted-mpir.so => /usr/mpi/icc/openmpi-icc/lib/libopen-orted-mpir.so (0x00007fdaa5ebb000)
>> libm.so.6 => /lib64/libm.so.6 (0x00007fdaa5b39000)
>> librt.so.1 => /lib64/librt.so.1 (0x00007fdaa5931000)
>> libutil.so.1 => /lib64/libutil.so.1 (0x00007fdaa572d000)
>> libz.so.1 => /lib64/libz.so.1 (0x00007fdaa5516000)
>> libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fdaa52fe000)
>> libpthread.so.0 => /lib64/libpthread.so.0 (0x00007fdaa50de000)
>> libc.so.6 => /lib64/libc.so.6 (0x00007fdaa4d1b000)
>> libdl.so.2 => /lib64/libdl.so.2 (0x00007fdaa4b17000)
>> libimf.so => /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libimf.so (0x00007fdaa4494000)
>> libsvml.so => /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libsvml.so (0x00007fdaa29c4000)
>> libirng.so => /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libirng.so (0x00007fdaa2659000)
>> libintlc.so.5 => /opt/intel/compilers_and_libraries_2020.4.304/linux/compiler/lib/intel64_lin/libintlc.so.5 (0x00007fdaa23e1000)
>> /lib64/ld-linux-x86-64.so.2 (0x00007fdaa66d6000)
>>
>> Can anyone suggest what I’m forgetting to do?
>>
>> ---
>>
>> Michael Heinz
>> Fabric Software Engineer, Cornelis Networks
>>
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
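
For anyone finding this thread later: the check Jeff suggests in the quoted text (run ldd through a non-interactive ssh rather than in your login shell) can be sketched roughly as below. This is only a minimal sketch, not anything taken from Open MPI itself. The function name missing_libs is made up for illustration, and it assumes the usual glibc ldd output format, where an unresolved library is printed as "name => not found".

```shell
#!/bin/sh
# missing_libs (hypothetical helper name): read ldd output on stdin and
# print the name of every library that ldd reports as "not found".
# Assumes the glibc ldd format: "<TAB>libfoo.so => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Intended use, with the placeholder host and path from the thread:
#   ssh othernode ldd /path/to/orted | missing_libs
# Empty output means every library resolved in the non-interactive
# environment on that node.
```

If libimf.so shows up in the output over ssh but not when you run ldd in an interactive session on the same node, that would point at the startup files setting LD_LIBRARY_PATH only for interactive logins.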