Hi,

I compiled OpenMPI-5.0.6 with ucx-1.18 and the Intel oneAPI 2024 v2.1 compilers, and then compiled an MPI program against this OpenMPI-5.0.6 build.
When I submit the job through PBS on a Linux cluster, I source the Intel compiler environment and pass the library path to OpenMPI's mpirun with the option " -x LD_LIBRARY_PATH=<lib path to intel compilers> ". The job still fails with the following error:

    prted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory
    PRTE has lost communication with a remote daemon.

      HNP daemon   : [prterun-cn19-2146925@0,0] on node cn19
      Remote daemon: [prterun-cn19-2146925@0,2] on node cn21

    This is usually due to either a failure of the TCP network
    connection to the node, or possibly an internal failure of
    the daemon itself. We cannot recover from this failure, and
    therefore will terminate the job.

However, if I put "source <path_of_intel_compiler>vars.sh" in ~/.bashrc, the job works fine. I do not think that is the right way to do it, though.

My question is: after passing -x LD_LIBRARY_PATH to the mpirun command, why is it still unable to find "libimf.so" on all the nodes? Is this a bug in OpenMPI-5.0.6?

Thanks
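
P.S. In case it helps with diagnosis, below is a minimal sketch of a check for whether libimf.so is reachable through a given library search path. The function name find_lib_in_path is hypothetical (it is not part of OpenMPI or PRTE), and the check only looks at the directories in the path it is given; it does not consult ld.so.conf or RPATH entries, so treat a "not found" result as a hint rather than proof.

```shell
#!/bin/sh
# Hypothetical helper (not part of OpenMPI): print the first location of a
# shared library within a colon-separated search path, or return non-zero.
find_lib_in_path() {
    lib_name=$1
    search_path=$2
    old_ifs=$IFS
    IFS=:
    for dir in $search_path; do
        # Skip empty path entries; test for the library file itself.
        if [ -n "$dir" ] && [ -e "$dir/$lib_name" ]; then
            IFS=$old_ifs
            printf '%s\n' "$dir/$lib_name"
            return 0
        fi
    done
    IFS=$old_ifs
    return 1
}

# Check the current environment for the library that prted reports missing.
if found=$(find_lib_in_path libimf.so "${LD_LIBRARY_PATH:-}"); then
    echo "libimf.so found at $found"
else
    echo "libimf.so not found on LD_LIBRARY_PATH"
fi
```

Running the same check inside the PBS job script and again in a non-interactive shell on a remote node (cn21 in the log above) should show whether the two environments actually differ.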