[OMPI users] Run-time problem
Hi:

When I execute something like

    mpirun -machinefile machinefile my_mpi_executable

I get something like this:

    my_mpi_executable symbol lookup error: remote_openmpi/lib/libmpi_cxx.so.0: undefined symbol: ompi_registered_datareps

where both my_mpi_executable and remote_openmpi are installed on NFS-mounted locations. Any clue?

Thanks,
JO
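One possible cause of an undefined symbol reported inside libmpi_cxx.so.0 is that the loader pairs it with a libmpi.so from a different Open MPI installation. A minimal sketch for checking this follows. The helper name resolved_path is hypothetical, and it only parses the output of ldd, so it shows what the loader would pick on the machine where ldd is run.

```shell
#!/bin/sh
# Sketch: report which file the dynamic loader resolves for a given library.
# Reads `ldd` output on stdin; $1 is a library name prefix such as "libmpi.so".
# resolved_path is a hypothetical helper name, not part of Open MPI.
resolved_path() {
    awk -v lib="$1" 'index($1, lib) == 1 && $2 == "=>" {
        # ldd prints "name => not found" when the loader cannot resolve a library
        print ($3 == "not" ? "not found" : $3)
    }'
}

# Intended use on a compute node (names are the placeholders from the message):
#   ldd my_mpi_executable | resolved_path libmpi.so
#   ldd my_mpi_executable | resolved_path libmpi_cxx.so
# If the two results sit under different install trees, two Open MPI
# installations are being mixed at run time.
```

Running the two commands both on the launching machine and on a remote node shows whether the nodes resolve the libraries differently.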
Re: [OMPI users] Run-time problem
Please let me go over it again; maybe it will help clarify things a bit. The OS on all the machines involved is Suse 10.3.

I have a place for the installed programs, say /programs. In /programs I have installed openmpi and my MPI program, say my_mpi_program. When I am in the working directory, my LD_LIBRARY_PATH includes both

    /programs/my_mpi_program/lib
    /programs/openmpi/lib

and my PATH includes

    /programs/my_mpi_program/bin
    /programs/openmpi/bin

So then I do

    mpirun -machinefile machinefile -np 20 my_mpi_program

and I get

    /programs/my_mpi_program: symbol lookup error: /programs/openmpi/lib/libmpi_cxx.so.0: undefined symbol: ompi_registered_datareps

When I configured openmpi, I did

    ./configure --prefix=/programs/openmpi

and then compiled it. Subsequently, I compiled my_mpi_program with the options:

    MPI_CXX=/programs/openmpi/bin/mpicxx
    MPI_CC=/programs/openmpi/bin/mpicc
    MPI_INCLUDE=/programs/openmpi/include/
    MPI_LIB=mpi
    MPI_LIBDIR=/programs/openmpi/lib/
    MPI_LINKERFORPROGRAMS=/programs/openmpi/bin/mpicxx

Any clue? The directory /programs is NFS-mounted on the nodes.

Many thanks again,
JO

--- On Thu, 3/5/09, justin oppenheim wrote:
From: justin oppenheim
Subject: Re: [OMPI users] Run-time problem
To: "Ralph Castain"
Date: Thursday, March 5, 2009, 5:28 PM

Hi Ralph:

Sorry for my ignorance, but regarding your option 2: to which command should I add the option --prefix=path-to-install? When I configure openmpi? I already did that when I configured and compiled openmpi. Also, in response to your option 1, I did add the openmpi library paths to LD_LIBRARY_PATH in the .cshrc on the nodes.

Thank you,
JO

--- On Thu, 3/5/09, Ralph Castain wrote:
From: Ralph Castain
Subject: Re: [OMPI users] Run-time problem
To: jl09...@yahoo.com
Cc: "Open MPI Users"
Date: Thursday, March 5, 2009, 12:46 PM

First, you can add --launch-agent rsh to the command line and that will have OMPI use rsh.

It sounds like your remote nodes may not be seeing your OMPI install directory. There are several ways you can resolve that; here are a couple:

1. Add the install directory to your LD_LIBRARY_PATH in your .cshrc (or whatever shell rc file you are using), and be sure this is being executed on the remote nodes.
2. Add --prefix=path-to-install on your command line; this will direct your remote procs to the proper libraries.

Ralph

On Mar 5, 2009, at 10:18 AM, justin oppenheim wrote:

Maybe I should also add that the program my_mpi_executable is installed under the same root directory as openmpi-1.3. This root directory is NFS-mounted on the working nodes.

Thanks,
JO

--- On Thu, 3/5/09, justin oppenheim wrote:
From: justin oppenheim
Subject: Re: [OMPI users] Run-time problem
To: "Ralph Castain"
Date: Thursday, March 5, 2009, 12:04 PM

Hi Ralph:

Thanks for your prompt response. I am using openmpi-1.3 on Suse 10.3. I installed openmpi-1.3 with

    ./configure --prefix=/where/to/install

and then just

    make all install

I thought the default connection mode was rsh, but I had to invoke ssh-agent so as not to have to enter passwords one by one. How do I change to rsh?

Thanks,
JO

--- On Thu, 3/5/09, Ralph Castain wrote:
From: Ralph Castain
Subject: Re: [OMPI users] Run-time problem
To: jl09...@yahoo.com, "Open MPI Users"
Date: Thursday, March 5, 2009, 11:40 AM

Could you tell us what version of Open MPI you are using, a little about your system (I would assume you are using ssh?), and how this was configured?

Thanks
Ralph

On Mar 5, 2009, at 9:31 AM, justin oppenheim wrote:

Hi:

When I execute something like

    mpirun -machinefile machinefile my_mpi_executable

I get something like this:

    my_mpi_executable symbol lookup error: remote_openmpi/lib/libmpi_cxx.so.0: undefined symbol: ompi_registered_datareps

where both my_mpi_executable and remote_openmpi are installed on NFS-mounted locations. Any clue?

Thanks,
JO

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
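The two options Ralph lists can be sketched roughly as follows. This is an illustration under assumptions, not a tested recipe for this cluster: the function names path_contains and mpirun_with_prefix are hypothetical, /programs/openmpi is the placeholder prefix from the thread, and the --prefix option is written in the space-separated form shown in the mpirun man page. The second function only prints the command so it can be inspected before being run.

```shell
#!/bin/sh
# Placeholder install prefix from the thread; adjust to the real location.
OMPI_PREFIX="${OMPI_PREFIX:-/programs/openmpi}"

# Option 1 check: succeed if directory $2 appears in the colon-separated list $1.
# (path_contains is a hypothetical helper name.)
path_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Option 2: print (rather than run) an mpirun command line that passes --prefix.
# Arguments: machinefile, process count, program.
mpirun_with_prefix() {
    printf 'mpirun --prefix %s -machinefile %s -np %s %s\n' \
        "$OMPI_PREFIX" "$1" "$2" "$3"
}

# To see what a non-interactive remote shell actually gets for option 1
# ("node01" is a hypothetical host name):
#   ssh node01 'echo $LD_LIBRARY_PATH'
mpirun_with_prefix machinefile 20 my_mpi_program
```

The remote check matters because a shell started by ssh or rsh for a single command does not necessarily read the same startup files as an interactive login, so a path set in .cshrc on the launching machine may or may not be visible to the remote processes.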
Re: [OMPI users] Run-time problem
Yes. As I indicated earlier, I used these options to compile my program:

    MPI_CXX=/programs/openmpi/bin/mpicxx
    MPI_CC=/programs/openmpi/bin/mpicc
    MPI_INCLUDE=/programs/openmpi/include/
    MPI_LIB=mpi /programs/openmpi/
    MPI_LIBDIR=/programs/openmpi/lib/
    MPI_LINKERFORPROGRAMS=/programs/openmpi/bin/mpicxx

where /programs/openmpi/ is the location I chose for installing the openmpi package (specifically, openmpi-1.3.tar.gz) that I downloaded from www.open-mpi.org.

Any clue? Again, my system is Suse 10.3 64-bit, which should be pretty standard. Would another package, openmpi-1.3-1.src.rpm, work better for my system?

Thanks,
JO

--- On Mon, 3/9/09, Ralph Castain wrote:
From: Ralph Castain
Subject: Re: [OMPI users] Run-time problem
To: jl09...@yahoo.com
Cc: us...@open-mpi.org
Date: Monday, March 9, 2009, 7:59 AM

Did you try compiling your program with the provided mpicc (or mpiCC, mpif90, etc., as appropriate) wrapper compiler? The wrapper compilers contain all the required library definitions to make the application work.

Compiling without the wrapper compilers is a very bad idea...

Ralph

On Mar 6, 2009, at 11:02 AM, justin oppenheim wrote:

> Please let me go over it again; maybe it will help clarify things a bit. The OS on all the machines involved is Suse 10.3.
> [...]
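To see what the wrapper compilers Ralph mentions would add to a link line, Open MPI's wrappers accept a --showme family of options that print the underlying command without running it. The sketch below extracts the library directories from such a line; lib_dirs is a hypothetical helper name, and the sample flags in the comment are only an illustration of the expected shape of the output.

```shell
#!/bin/sh
# Read a compiler or linker command line on stdin and print its -L directories,
# one per line. (lib_dirs is a hypothetical helper name.)
lib_dirs() {
    tr ' ' '\n' | sed -n 's/^-L//p'
}

# Intended use, with the Open MPI wrapper on PATH:
#   mpicxx --showme:link | lib_dirs
# For the install described in the thread one would hope to see only
#   /programs/openmpi/lib
# A directory belonging to another MPI installation would suggest that the
# build is picking up the wrong wrapper.
```

Comparing this output with the MPI_LIBDIR value passed to the application's build system is one way to confirm that both point at the same installation.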
Re: [OMPI users] Run-time problem
Hi Jeff:

I managed to run it just recently... It turns out that some libib* libraries were missing, as well as some others. I learned this by trying to install an old version of openmpi from the repository of my Suse Linux: the Suse "software manager" told me which libraries the old openmpi was missing. After I installed these libraries, the already installed new openmpi (downloaded from open-mpi.org) works.

Maybe it would be a good idea to spell this out on the Open MPI web site. People might just install openmpi without knowing that some libraries could be missing...

Thanks!
JO

--- On Sat, 3/14/09, Jeff Squyres wrote:
From: Jeff Squyres
Subject: Re: [OMPI users] Run-time problem
To: jl09...@yahoo.com, "Open MPI Users"
Cc: "Ralph Castain"
Date: Saturday, March 14, 2009, 9:15 AM

Sorry for the delay in replying; this week unexpectedly turned exceptionally hectic for several of us...

On Mar 9, 2009, at 2:53 PM, justin oppenheim wrote:

> Yes. As I indicated earlier, I did use these options to compile my program
>
> MPI_CXX=/programs/openmpi/bin/mpicxx
> MPI_CC=/programs/openmpi/bin/mpicc
> MPI_INCLUDE=/programs/openmpi/include/
> MPI_LIB=mpi /programs/openmpi/
> MPI_LIBDIR=/programs/openmpi/lib/
> MPI_LINKERFORPROGRAMS=/programs/openmpi/bin/mpicxx

Ah; I think Ralph was asking because we don't know exactly how these "environment variables" are being used to build your application.

> where /programs/openmpi/ is the chosen location for installing the openmpi package (specifically, openmpi-1.3.tar.gz) that I downloaded from www.open-mpi.org.

Can you ensure that you have exactly the same version of Open MPI installed on all nodes in exactly the same location in the filesystem? (It doesn't *have* to be the same location on the filesystem on all the nodes, but it sure is easier if it is.) Also be sure that when you mpirun across multiple nodes, the same version of Open MPI (both executables and libraries) is being found on all nodes.
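Jeff's request to confirm that every node finds the same Open MPI version could be checked along these lines. This is a sketch under assumptions: all_same_version is a hypothetical helper, the loop in the comment assumes password-less ssh, a machinefile whose first field on each line is a host name, and an ompi_info that prints a line containing "Open MPI:" followed by the version.

```shell
#!/bin/sh
# Read "host version" pairs on stdin; exit 0 only if every version equals the
# first one seen. (all_same_version is a hypothetical helper name.)
all_same_version() {
    awk 'NR == 1 { first = $2 } $2 != first { bad = 1 } END { exit bad }'
}

# One possible way to produce the input:
#   for h in $(awk '{ print $1 }' machinefile); do
#       v=$(ssh "$h" ompi_info | awk '/Open MPI:/ { print $3; exit }')
#       printf '%s %s\n' "$h" "$v"
#   done | all_same_version && echo "versions match"
```

Because the command runs through a non-interactive remote shell, it also reveals nodes where ompi_info is not on the PATH at all: those produce an empty version field and fail the comparison.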
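The missing libraries JO describes could also have been spotted without the package manager by asking ldd which shared-library dependencies the loader cannot resolve. A minimal sketch follows; missing_libs is a hypothetical helper name, /programs/openmpi is the placeholder prefix from the thread, and the assumption that plugins live under lib/openmpi should be checked against the actual install tree.

```shell
#!/bin/sh
# Read `ldd` output on stdin and print the names of libraries reported as
# "not found". (missing_libs is a hypothetical helper name.)
missing_libs() {
    awk '$2 == "=>" && $3 == "not" && $4 == "found" { print $1 }'
}

# Intended use on each node:
#   for f in /programs/openmpi/lib/*.so /programs/openmpi/lib/openmpi/*.so; do
#       ldd "$f" | missing_libs
#   done | sort -u
# Any name printed is a dependency that must be installed on that node.
```

Running the loop on a compute node rather than on the build machine is the point of the exercise, since libraries present where Open MPI was compiled may be absent elsewhere.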