I have solved this issue. All the paths were correct, but I still had to pass 
the variable explicitly when launching the job:

mpirun -x LD_LIBRARY_PATH....
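
For anyone hitting the same error, here is a minimal sketch of what such a launch line might look like. The process count, hostfile name, and executable name are placeholders (they were not given in this thread), and the helper only prints the command so it can be checked before it is run. The -x option asks mpirun to export the named environment variable from the launching shell to the remote processes.

```shell
# build_mpirun_cmd is a hypothetical helper: it prints the launch line
# instead of running it, so the command can be inspected first.
#   $1 = number of processes, $2 = hostfile, $3 = executable
build_mpirun_cmd() {
    printf 'mpirun -np %s --hostfile %s -x LD_LIBRARY_PATH %s\n' "$1" "$2" "$3"
}

# Placeholder values; adjust them to your cluster.
build_mpirun_cmd 4 hosts.txt ./cuda_mpi_app
```

Once the printed line looks right, it can be pasted into the shell or job script that launches the job.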


Another option is to update your .bashrc/.cshrc.
Just add LD_LIBRARY_PATH to the file, and the updated variable will be 
available on remote machines as well.
(I am assuming here that the same home directory is available on all nodes.)
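
A rough sketch of what could go into .bashrc for this, assuming a bash or POSIX shell (the .cshrc syntax differs). The add_ld_path helper is a made-up name for illustration; it appends a directory only when it is not already listed, so sourcing the file repeatedly does not grow the variable. The directory used is the CUDA one mentioned in the original post.

```shell
# add_ld_path is a hypothetical helper, not part of Open MPI or CUDA.
# It appends $1 to LD_LIBRARY_PATH unless it is already present.
add_ld_path() {
    case ":${LD_LIBRARY_PATH}:" in
        *":$1:"*) ;;  # already listed, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}$1" ;;
    esac
    export LD_LIBRARY_PATH
}

# CUDA library directory taken from the original post.
add_ld_path /usr/local/cuda/lib
```

One thing to watch: on some distributions the default .bashrc returns early for non-interactive shells, so lines like these may need to sit above that check to take effect for remotely launched processes.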

-Pasha


It now works like a charm.

Thanks for your responses.

On Wed, May 2, 2012 at 4:04 AM, Trent <tjones...@hotmail.com> wrote:
That is discussed here:

http://forums.nvidia.com/index.php?showtopic=227854

Maybe that could be your issue too.



From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On 
Behalf Of Rohan Deshpande
Sent: Tuesday, May 01, 2012 4:04 AM
To: Open MPI Users
Subject: [OMPI users] OpenMPI and CUDA on cluster

Hi,

I am trying to run OpenMPI and CUDA code on a cluster. The code works fine 
on a single machine, but when I try to run it on the cluster I get this error:

error while loading shared libraries: libcudart.so.4: cannot open shared object 
file: No such file or directory
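
A quick way to see which libraries the dynamic loader cannot resolve is to filter the output of ldd. The sketch below assumes a Linux system with ldd and awk available; missing_libs is a made-up helper name and the binary path is a placeholder.

```shell
# missing_libs is a hypothetical helper: it prints the name of every shared
# library that ldd reports as "not found" for the given binary.
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

# Placeholder binary; prints nothing when every library resolves.
missing_libs ./cuda_mpi_app
```

Running the same check on a compute node, in the same kind of shell that the remote launch uses, shows whether LD_LIBRARY_PATH is actually set there.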

I checked my PATH and LD_LIBRARY_PATH and they look OK. I have a .bashrc file 
which contains the following entries:

export PATH=$PATH:/usr/local/lib/:/usr/local/lib/openmpi:/usr/local/cuda/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib:/usr/local/lib/openmpi/:/usr/local/cuda/lib/:

All the machines have the same installation of CUDA and OpenMPI.

Can anyone help me with this?

This problem is really annoying.

Thanks.




_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--

Best Regards,

ROHAN DESHPANDE


