Hi Patrick,

Thanks for your reply.
Of course, the Intel vars.sh is sourced inside the PBS script, and I've tried
several ways to resolve this issue:

-x LD_LIBRARY_PATH
and
-x LD_LIBRARY_PATH=/opt/intel/oneapi/2024/v2.1/compiler/2024.2/opt/compiler/lib:/opt/intel/oneapi/2024/v2.1/compiler/2024.2/lib:${LD_LIBRARY_PATH}

I then copied libimf.so to the job's working directory and set
-x LD_LIBRARY_PATH=.:/opt/intel/oneapi/2024/v2.1/compiler/2024.2/opt/compiler/lib:/opt/intel/oneapi/2024/v2.1/compiler/2024.2/lib:${LD_LIBRARY_PATH}

But none of these attempts worked.
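
In case it helps to narrow this down, here is a minimal sketch of a check that
could be run in a shell on each compute node (for example the remote node named
in the error message) to see which directories of a search path actually
contain libimf.so. The function name find_lib_in_path is hypothetical, not part
of Open MPI or PBS. It only tests whether the file exists in each listed
directory; it does not show which environment prted itself is started with.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not an Open MPI tool).
# Prints every directory in a colon-separated search path that contains LIBNAME.
# Usage: find_lib_in_path LIBNAME [SEARCH_PATH]
# SEARCH_PATH defaults to the current value of LD_LIBRARY_PATH.
# Returns 0 if at least one directory contains the library, 1 otherwise.
# The body runs in a subshell so the IFS and globbing changes do not leak out.
find_lib_in_path() (
    lib=$1
    search_path=${2-$LD_LIBRARY_PATH}
    status=1
    IFS=:      # split the search path on ':' only
    set -f     # disable globbing so directory names are taken literally
    for dir in $search_path; do
        # The dynamic loader treats an empty entry as the current directory.
        [ -n "$dir" ] || dir=.
        if [ -e "$dir/$lib" ]; then
            printf '%s\n' "$dir"
            status=0
        fi
    done
    return "$status"
)
```

Calling "find_lib_in_path libimf.so" with no second argument checks the
LD_LIBRARY_PATH of the shell it runs in, so the result from a non-interactive
shell on the remote node can be compared with the result inside the PBS script.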

On Fri, Feb 14, 2025 at 6:30 PM Patrick Begou <
patrick.be...@univ-grenoble-alpes.fr> wrote:

> On 14/02/2025 at 13:22, Sangam B wrote:
> > Hi,
> >
> > OpenMPI-5.0.6 is compiled with ucx-1.18 and the Intel oneAPI 2024 v2.1
> > compilers. An MPI program is compiled with this openmpi-5.0.6.
> >
> > When submitting a job through PBS on a Linux cluster, the Intel compiler
> > environment is sourced and passed through OpenMPI's mpirun command
> > option: " -x LD_LIBRARY_PATH=<lib path to intel compilers> ". But
> > the job still fails with the following error:
> >
> > prted: error while loading shared libraries: libimf.so: cannot open
> > shared object file: No such file or directory
> >
> > PRTE has lost communication with a remote daemon.
> >
> >   HNP daemon   : [prterun-cn19-2146925@0,0] on node cn19
> >   Remote daemon: [prterun-cn19-2146925@0,2] on node cn21
> >
> > This is usually due to either a failure of the TCP network
> > connection to the node, or possibly an internal failure of
> > the daemon itself. We cannot recover from this failure, and
> > therefore will terminate the job.
> >
> > However, if I put "source <path_of_intel_compiler>vars.sh" in
> > ~/.bashrc, the job works fine. But this is not the right way to do it.
> >
> > My question is: after passing -x LD_LIBRARY_PATH to the mpirun
> > command, why is it not able to find "libimf.so" on all the
> > nodes? Is this a bug in OpenMPI-5.0.6?
> >
> > Thanks
> > To unsubscribe from this group and stop receiving emails from it, send
> > an email to users+unsubscr...@lists.open-mpi.org.
>
>
> Hi Sangam,
>
> The "-x" option propagates your LD_LIBRARY_PATH as it is set on the
> execution node. So maybe you only need to set "-x LD_LIBRARY_PATH"
> after sourcing your <path_of_intel_compiler>vars.sh in your PBS script?
>
> Patrick (not using PBS but Slurm, sorry)
