[OMPI users] problem about openmpi running
hi, I'm Calin from India. I'm working on Open MPI: I have installed openmpi-1.1.1.tar.gz on four machines in our college lab. On one machine Open MPI works properly. I have written a "hello world" program on all the machines, but it only runs correctly on that one machine; on the others it gives:

hello: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

What is the problem, and how do I solve it? Please tell me.

calin pal
india
fergusson college
msc.tech (maths and computer sc.)
Re: [OMPI users] problem about openmpi running
Calin,

Looks like you're missing a proper value for LD_LIBRARY_PATH. Please read the Open MPI FAQ at http://www.open-mpi.org/faq/?category=running.

Thanks,
george.

On Oct 19, 2006, at 6:41 AM, calin pal wrote:
> [quoted text trimmed]

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
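In shell terms, George's suggestion usually comes down to something like the following sketch, assuming the default /usr/local install prefix — adjust the paths if configure was run with a different --prefix:

```shell
# Point the run-time linker at the Open MPI libraries (prefix assumed
# to be /usr/local). Put these lines in ~/.bashrc on every node, since
# mpirun starts remote processes through non-interactive logins.
export PATH=/usr/local/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/lib:${LD_LIBRARY_PATH:-}
echo "$LD_LIBRARY_PATH"
```

After logging in again, `ldd hello` should show libmpi.so.0 resolving under /usr/local/lib instead of "not found".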
Re: [OMPI users] problem about openmpi running
George,

I knew that was the answer to Calin's question, but I would still like to understand the issue: by default, the Open MPI installer puts the libraries in /usr/local/lib, which is a standard location for the C compiler to look for libraries. So *why* do I need to specify it explicitly with LD_LIBRARY_PATH? For example, when I compile with pthread calls and pass -lpthread to gcc, I need not specify the location of libpthread.so with LD_LIBRARY_PATH. I had the same problem as Calin, so I am curious. This assumes he has not redirected the installation to some non-standard location.

Thanks
Durga

On 10/19/06, George Bosilca wrote:
> [quoted text trimmed]

--
Devil wanted omnipresence; He therefore created communists.
Re: [OMPI users] problem about openmpi running
On a number of my Linux machines, /usr/local/lib is not searched by ldconfig, and hence is not going to be found by gcc. You can fix this by adding /usr/local/lib to /etc/ld.so.conf and running ldconfig (add the -v flag if you want to see the output).

-Justin.

On 10/19/06, Durga Choudhury wrote:
> [quoted text trimmed]
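Justin's fix, spelled out as commands. The edit itself needs root, so this sketch rehearses it against a scratch copy of the file first; the real commands are in the comments:

```shell
# Rehearse the /etc/ld.so.conf edit on a scratch file (the real file
# needs root to modify): append /usr/local/lib if it is not there yet.
conf=$(mktemp)
grep -qx /usr/local/lib "$conf" || echo /usr/local/lib >> "$conf"
cat "$conf"
# The real thing, as root:
#   echo /usr/local/lib >> /etc/ld.so.conf
#   ldconfig -v | grep libmpi    # libmpi.so.0 should now enter the cache
```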
Re: [OMPI users] problem about openmpi running
Well, that helped me realize that there are people who type a lot faster than me :) Thanks Justin for your accurate answer. The only thing to add is that in this particular case it is not gcc that fails to find the library (the error does not happen at compile time) but the run-time linker. So /etc/ld.so.conf is definitely the problem.

george.

On Oct 19, 2006, at 11:30 AM, Justin Bronder wrote:
> [quoted text trimmed]
[OMPI users] configure script not hapy with OpenPBS
Hi,

When I tried to install Open MPI on the front node of a cluster using the OpenPBS batch system (i.e. passing --with-tm=/usr/open-pbs to configure), it didn't work and I got this error message:

--- MCA component pls:tm (m4 configuration macro)
checking for MCA component pls:tm compile mode... dso
checking tm.h usability... yes
checking tm.h presence... yes
checking for tm.h... yes
looking for library in lib
checking for tm_init in -lpbs... no
looking for library in lib64
checking for tm_init in -lpbs... no
checking tm.h usability... yes
checking tm.h presence... yes
checking for tm.h... yes
looking for library in lib
checking for tm_finalize in -ltorque... no
looking for library in lib64
checking for tm_finalize in -ltorque... no
configure: error: TM support requested but not found. Aborting

By looking in the very long configure script I found two typos in variable names: "ompi_check_tm_hapy" is set at lines 68164 and 76084, and "ompi_check_loadleveler_hapy" is set at line 73086, where the correct names are obviously "ompi_check_tm_happy" and "ompi_check_loadleveler_happy" (i.e. "happy", not "hapy"), judging from the variables used around them. I corrected the variable names, but unfortunately it didn't fix my problem; configure stopped with the same error message (maybe you should also correct this in your svn repository, since it may be a latent bug).

I'm now questioning why the configure script didn't find the 'tm_init' symbol in libpbs.a, since the following command:

nm /usr/open-pbs/lib/libpbs.a | grep -e '\<tm_init\>' -e '\<tm_finalize\>'

prints:

0cd0 T tm_finalize
1270 T tm_init

Is it possible that on an EM64T Linux system the configure script requires lib/libpbs.a or lib64/libpbs.a to be a 64-bit library to be happy? (lib64/libpbs.a doesn't exist and lib/libpbs.a is a 32-bit library on our system, since the OpenPBS version we use is a bit old (2.3.x) and didn't appear to be 64-bit clean.)

Martin Audet
[OMPI users] Problem with PGI 6.1 and OpenMPI-1.1.1
Good afternoon,

I really hate to post asking for help with a problem, but my own efforts have not worked out well (probably operator error). Anyway, I'm trying to run a code that was built with PGI 6.1 and OpenMPI-1.1.1. The mpirun command looks like:

mpirun --hostfile machines.${PBS_JOBID} --np ${NP} -mca btl self,sm,tcp ./${EXE} ${CASEPROJ} >> OUTPUT

I get the following error in the PBS error file:

[o1:22559] mca_oob_tcp_accept: accept() failed with errno 9.

... and it keeps repeating (for a long time). ompi_info gives the following output:

> ompi_info
Open MPI: 1.1.1
Open MPI SVN revision: r11473
Open RTE: 1.1.1
Open RTE SVN revision: r11473
OPAL: 1.1.1
OPAL SVN revision: r11473
Prefix: /usr/x86_64-pgi-6.1/openmpi-1.1.1
Configured architecture: x86_64-suse-linux-gnu
Configured by: root
Configured on: Mon Oct 16 20:51:34 MDT 2006
Configure host: lo248
Built by: root
Built on: Mon Oct 16 21:02:00 MDT 2006
Built host: lo248
C bindings: yes
C++ bindings: yes
Fortran77 bindings: yes (all)
Fortran90 bindings: yes
Fortran90 bindings size: small
C compiler: pgcc
C compiler absolute: /opt/pgi/linux86-64/6.1/bin/pgcc
C++ compiler: pgCC
C++ compiler absolute: /opt/pgi/linux86-64/6.1/bin/pgCC
Fortran77 compiler: pgf77
Fortran77 compiler abs: /opt/pgi/linux86-64/6.1/bin/pgf77
Fortran90 compiler: pgf90
Fortran90 compiler abs: /opt/pgi/linux86-64/6.1/bin/pgf90
C profiling: yes
C++ profiling: yes
Fortran77 profiling: yes
Fortran90 profiling: yes
C++ exceptions: yes
Thread support: posix (mpi: no, progress: no)
Internal debug support: no
MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
libltdl support: yes
MCA memory: ptmalloc2 (MCA v1.0, API v1.0, Component v1.1.1)
MCA paffinity: linux (MCA v1.0, API v1.0, Component v1.1.1)
MCA maffinity: first_use (MCA v1.0, API v1.0, Component v1.1.1)
MCA maffinity: libnuma (MCA v1.0, API v1.0, Component v1.1.1)
MCA timer: linux (MCA v1.0, API v1.0, Component v1.1.1)
MCA allocator: basic (MCA v1.0, API v1.0, Component v1.0)
MCA allocator: bucket (MCA v1.0, API v1.0, Component v1.0)
MCA coll: basic (MCA v1.0, API v1.0, Component v1.1.1)
MCA coll: hierarch (MCA v1.0, API v1.0, Component v1.1.1)
MCA coll: self (MCA v1.0, API v1.0, Component v1.1.1)
MCA coll: sm (MCA v1.0, API v1.0, Component v1.1.1)
MCA coll: tuned (MCA v1.0, API v1.0, Component v1.1.1)
MCA io: romio (MCA v1.0, API v1.0, Component v1.1.1)
MCA mpool: gm (MCA v1.0, API v1.0, Component v1.1.1)
MCA mpool: sm (MCA v1.0, API v1.0, Component v1.1.1)
MCA pml: ob1 (MCA v1.0, API v1.0, Component v1.1.1)
MCA bml: r2 (MCA v1.0, API v1.0, Component v1.1.1)
MCA rcache: rb (MCA v1.0, API v1.0, Component v1.1.1)
MCA btl: gm (MCA v1.0, API v1.0, Component v1.1.1)
MCA btl: self (MCA v1.0, API v1.0, Component v1.1.1)
MCA btl: sm (MCA v1.0, API v1.0, Component v1.1.1)
MCA btl: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA topo: unity (MCA v1.0, API v1.0, Component v1.1.1)
MCA osc: pt2pt (MCA v1.0, API v1.0, Component v1.0)
MCA gpr: null (MCA v1.0, API v1.0, Component v1.1.1)
MCA gpr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
MCA gpr: replica (MCA v1.0, API v1.0, Component v1.1.1)
MCA iof: proxy (MCA v1.0, API v1.0, Component v1.1.1)
MCA iof: svc (MCA v1.0, API v1.0, Component v1.1.1)
MCA ns: proxy (MCA v1.0, API v1.0, Component v1.1.1)
MCA ns: replica (MCA v1.0, API v1.0, Component v1.1.1)
MCA oob: tcp (MCA v1.0, API v1.0, Component v1.0)
MCA ras: dash_host (MCA v1.0, API v1.0, Component v1.1.1)
MCA ras: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
MCA ras: localhost (MCA v1.0, API v1.0, Component v1.1.1)
MCA rds: hostfile (MCA v1.0, API v1.0, Component v1.1.1)
MCA rds: resfile (MCA v1.0, API v1.0, Component v1.1.1)
MCA rmaps: round_robin (MCA v1.0, API v1.0, Component v1.1.1)
MCA rmgr: proxy (MCA v1.0, API v1.0, Component v1.1.1)
MCA rmgr: urm (MCA v1.0, API v1.0, Component v1.1.1)
MCA rml: oob (MCA v1.0, API v1.0, Component v1.1.1)
MCA pls: fork (MCA v1.0, API v1.0, Component v1.1.1)
MCA pls: rsh (MCA v1.0, API v1.0, Component v1.1.1)
MCA sds: env (MCA v1.0, API
Re: [OMPI users] Problem with PGI 6.1 and OpenMPI-1.1.1
A small update. I was looking through the error file a bit more (it was 159 MB) and found the following error message sequence:

[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o4:11242] [0,1,4]-[0,0,0] mca_oob_tcp_peer_recv_blocking: recv() failed with errno=104
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
...
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o3:32205] [0,1,2]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=32205)
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
[o3:32206] [0,1,3]-[0,0,0] mca_oob_tcp_peer_complete_connect: connection failed (errno=111) - retrying (pid=32206)
[o1:22805] mca_oob_tcp_accept: accept() failed with errno 9.
...

I don't know if this changes things (my google attempts didn't really give me much information).

Jeff

> [original message quoted in full; trimmed]
Re: [OMPI users] configure script not hapy with OpenPBS
Hi Martin,

Yeah, we appear to have some mistakes in the configuration macros. I will correct them, but they really should not be affecting things in this instance.

Whether Open MPI expects a 32-bit or 64-bit library depends on the compiler. If your compiler generates 64-bit executables by default, we will by default compile Open MPI in 64-bit mode and expect 64-bit libraries. Unfortunately there is no single simple flag to switch between 64-bit and 32-bit mode. With gcc I use the following configure line to compile in 32-bit mode:

./configure FCFLAGS=-m32 FFLAGS=-m32 CFLAGS=-m32 CXXFLAGS=-m32 --with-wrapper-cflags=-m32 --with-wrapper-cxxflags=-m32 --with-wrapper-fflags=-m32 --with-wrapper-fcflags=-m32

I know that is a bit unwieldy, but I believe that is the best way to do it right now. If this does not work, please send the information requested here: http://www.open-mpi.org/community/help/

Thanks,
Tim

On Oct 19, 2006, at 2:48 PM, Audet, Martin wrote:
> [quoted text trimmed]