[OMPI users] libimf.so Error
I just installed Open MPI on our cluster, and whenever I try to execute a process on more than one node, I get this error:

$ mpirun -hostfile $HOSTFILE -n 1 hello_c
orted: error while loading shared libraries: libimf.so: cannot open shared object file: No such file or directory

... followed by a whole bunch of timeout errors that I'm assuming were caused by the library error above.

The cluster has 16 nodes and is running Ubuntu 8.04 Server. The Open MPI source was compiled with openib support using the Intel compilers:

$ ./configure --prefix=/usr/local --with-openib=/usr/local/lib CC=icc CFLAGS=-m64 CXX=icpc CXXFLAGS=-m64 F77=ifort FFLAGS=-m64 FC=ifort FCFLAGS=-m64

I've installed the Intel compilers on the master node only, but I've installed them in the /usr/local directory, which is accessible to all nodes via NFS. Similarly, I've compiled and installed Open MPI only on the master node, but in the NFS-shared /usr/local directory as well. Finally, I've compiled and installed all of the OpenFabrics libraries on the master node only, but in the NFS-shared /usr/local/lib directory.

I've run the iccvars.sh and ifortvars.sh scripts on each node to ensure that the environment variables are set up for the Intel compilers on each node. Additionally, I've modified the LD_LIBRARY_PATH variable on each node to include /usr/local/lib and /usr/local/lib/openmpi so that each node can see the InfiniBand and Open MPI libraries.

If I only execute Open MPI on the master node, it works fine:

$ mpirun -hostfile $HOSTFILE -n 1 hello_c
Hello, world, I am 0 of 1

Sorry for the long post, and thanks for your help in advance!

---
Chris Tanner
Space Systems Design Lab
Georgia Institute of Technology
christopher.tan...@gatech.edu
---
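For what it's worth, the missing library can be pinpointed by running ldd against the orted binary from one of the compute nodes (node02 below is just a placeholder for one of your node names):

$ ssh node02 'ldd /usr/local/bin/orted | grep "not found"'

Every line this prints names a shared library the dynamic linker cannot resolve on that node; given the error above, libimf.so should be among them.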
Re: [OMPI users] users Digest, Vol 1000, Issue 1
Jeremy -

Thanks for the help - this bit of advice came up quite a bit through internet searches. However, I made sure that the LD_LIBRARY_PATH was set and correct on all nodes -- and the error persists. Any other possible solutions? Thanks.

---
Chris Tanner
Space Systems Design Lab
Georgia Institute of Technology
christopher.tan...@gatech.edu
---

On Sep 9, 2008, at 12:00 PM, users-requ...@open-mpi.org wrote:

> The library you specified in your post (libimf.so) is part of the Intel Compiler Suite (fce and cce). You'll need to make those libraries available to your computation nodes and update the LD_LIBRARY_PATH accordingly.
>
> Jeremy Stout
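One thing worth checking in this situation: mpirun starts orted on the remote nodes over a non-interactive ssh session, and a non-interactive shell on Ubuntu does not necessarily read the same startup files as a login shell. A quick sketch of the check (node02 is a placeholder node name):

$ ssh node02 'echo $LD_LIBRARY_PATH'

If the Intel library directory is missing from the output, LD_LIBRARY_PATH is only being set for interactive logins, and orted will never see it.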
Re: [OMPI users] users Digest, Vol 1000, Issue 1
Jeremy -

I think I've found the problem / solution. With Ubuntu, there's a program called 'ldconfig' that updates the dynamic linker's run-time bindings. Since Open MPI was compiled to use dynamic linking, these bindings have to be updated. Thus, these commands have to be run on all of the nodes:

$ sudo ldconfig -v /usr/local/lib
$ sudo ldconfig -v /usr/local/lib/openmpi

When installing from an RPM (in Red Hat) or from a .deb package (in Debian), this linking is done automatically at the end of the install. However, if you compile from source, you have to run it manually. Now Open MPI runs fine. :)

---
Chris Tanner
Space Systems Design Lab
Georgia Institute of Technology
christopher.tan...@gatech.edu
---

> The library you specified in your post (libimf.so) is part of the Intel Compiler Suite (fce and cce). You'll need to make those libraries available to your computation nodes and update the LD_LIBRARY_PATH accordingly.
>
> Jeremy Stout
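A note on making this stick: directories passed to ldconfig on the command line are added to the cache only for that run, so a later ldconfig invocation that omits them (for example, one triggered by a package install) will rebuild the cache without them. A sketch of the usual way to make the paths permanent on Debian-based systems (the file name openmpi.conf is arbitrary):

$ echo /usr/local/lib | sudo tee /etc/ld.so.conf.d/openmpi.conf
$ echo /usr/local/lib/openmpi | sudo tee -a /etc/ld.so.conf.d/openmpi.conf
$ sudo ldconfig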
[OMPI users] Default to TCP/IP?
I compiled the Open MPI source with openib support. However, the InfiniBand part is still not working right (I had to build it from source since I'm using Ubuntu, and it's a mess). If I execute 'mpirun', I assume it will automatically try to communicate over InfiniBand. However, since InfiniBand is not working, will Open MPI fall back to using the standard Ethernet connection, or will it just not work at all? Is there a way to tell Open MPI to do so in some configuration file? Thanks.

---
Chris Tanner
Space Systems Design Lab
Georgia Institute of Technology
christopher.tan...@gatech.edu
---
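For reference, transport selection in Open MPI is driven by MCA parameters, so TCP can also be requested explicitly rather than relying on any automatic fallback. A sketch (parameter names as in the 1.2/1.3-era releases):

$ mpirun --mca btl tcp,self -hostfile $HOSTFILE -n 1 hello_c

To make that the default for every run, the same parameter can go in a per-user configuration file:

$ mkdir -p $HOME/.openmpi
$ echo "btl = tcp,self" >> $HOME/.openmpi/mca-params.conf

The self component handles a process sending to itself, so it should stay in the list alongside tcp.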