Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C++)
Hello Jeff, Gustavo, Mi,

Thanks for the advice. I am familiar with the differences in compiler code generation for C, C++ and FORTRAN. I even looked at some of the common block symbols. The name of the symbol remains the same. The only difference that I observe is that in the FORTRAN-compiled *.o the symbol is

00515bc0 B aux7loc_

and in the C++-compiled code it is

U aux7loc_

The memory is not allocated there, as it has been declared extern in C++. When the executable loads the shared library it finds all the undefined symbols. At least, if it fails to find a single symbol, it prints an undefined symbol error. I am completely stuck and do not know how to continue.

Thanks,
Rajesh

From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Mi Yan
Sent: Saturday, 1 November 2008 23:26
To: Open MPI Users
Cc: 'Open MPI Users'; users-boun...@open-mpi.org
Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C++)

So your tests show:
1. "Shared library in FORTRAN + MPI executable in FORTRAN" works.
2. "Shared library in C++ + MPI executable in FORTRAN" does not work.

It seems to me that the symbols in the C library are not really recognized by the FORTRAN executable as you thought. What compilers did you use to build OpenMPI?

Different compilers have different conventions for handling symbols. E.g. if there is a variable "var_foo" in your FORTRAN code, some FORTRAN compilers will save "var_foo_" in the object file by default; if you want to access "var_foo" in C code, you actually need to refer to "var_foo_" in the C code. If you define "var_foo" in a FORTRAN module, some FORTRAN compilers may append the module name to "var_foo". So I suggest checking the symbols in the object files generated by your FORTRAN and C compilers to see the difference.
Mi

"Rajesh Ramaya"
Sent by: users-boun...@open-mpi.org
10/31/2008 03:07 PM
Please respond to Open MPI Users
To: "'Open MPI Users'", "'Jeff Squyres'"
Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C++)

Hello Jeff Squyres,

Thank you very much for the immediate reply. I am able to successfully access the data from the common block, but the values are zero. In my algorithm I even update a common block, but the update made by the shared library is not taken into account by the executable. Can you please be very specific about how to make the parallel algorithm aware of the data? Actually, I am not writing any MPI code myself. It's the executable (third-party software) that does that part. All I am doing is compiling my code with the MPI C compiler and adding it to LD_LIBRARY_PATH. In fact, I did a simple test by creating a shared library from FORTRAN code, and the update made to the common block is taken into account by the executable. Is there any flag or pragma that needs to be activated for mixed-language MPI?

Thank you once again for the reply.
Rajesh

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Friday, 31 October 2008 18:53
To: Open MPI Users
Subject: Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C++)

On Oct 31, 2008, at 11:57 AM, Rajesh Ramaya wrote:

> I am completely new to MPI. I have a basic question concerning
> MPI and mixed language coding. I hope any of you could help me out.
> Is it possible to access FORTRAN common blocks in C++ in a MPI
> compiled code. It works without MPI but as soon as I switch to MPI the
> access of common block does not work anymore.
> I have a Linux MPI executable which loads a shared library at
> runtime and resolves all undefined symbols etc. The shared library
> is written in C++ and the MPI executable is written in FORTRAN. Some
> of the input that the shared library is looking for are in the Fortran
> common blocks. As I access those common blocks during runtime the
> values are not initialized. I would like to know if what I am
> doing is possible? I hope that my problem is clear.

Generally, MPI should not get in the way of sharing common blocks between Fortran and C/C++. Indeed, in Open MPI itself, we share a few common blocks between Fortran and the main C Open MPI implementation.

What is the exact symptom that you are seeing? Is the application failing to resolve symbols at run-time, possibly indicating that something hasn't instantiated a common block? Or are you able to successfully access the data from the common block, but it doesn't have the values you expect (e.g., perhaps you're seeing all zeros)?

If the former, you might want to check your build procedure. You *should* be able to simply replace your C++ / F90 compilers with mpicxx and mpif90, respectively, and be able to build an MPI version of your app. If the latter, you might need to make your parallel algorithm aware of what data is available in which MPI process
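Mi's point about the trailing underscore can be sketched in C++ roughly as follows. This is only an illustration under stated assumptions: the symbol name aux7loc_ is taken from the nm output quoted in this thread, but the layout of the block (three doubles and one integer) and the helper names sum_common_block and scale_common_block are hypothetical. The storage is defined in the C++ file purely so the sketch links on its own; in the setup described here the Fortran object would provide it and the C++ side would keep only an extern declaration.

```cpp
// Assumed Fortran side (hypothetical layout, for illustration only):
//       double precision vals(3)
//       integer n
//       common /aux7loc/ vals, n
// The member order and types below must mirror the common block exactly.
struct Aux7Loc {
    double vals[3];
    int    n;
};

extern "C" {
    // Stand-in definition so this sketch is self-contained. In a real plugin
    // this would be declaration-only:  extern Aux7Loc aux7loc_;
    // and nm would then list it as "U aux7loc_" in the C++ object.
    Aux7Loc aux7loc_ = {{1.0, 2.0, 3.0}, 3};
}

// Read the block: sums the first n entries of vals.
double sum_common_block() {
    double total = 0.0;
    for (int i = 0; i < aux7loc_.n; ++i) {
        total += aux7loc_.vals[i];
    }
    return total;
}

// Update the block in place. The Fortran side sees the change only if both
// sides really resolve to the same storage.
void scale_common_block(double factor) {
    for (int i = 0; i < aux7loc_.n; ++i) {
        aux7loc_.vals[i] *= factor;
    }
}
```

Reads that come back as zeros and updates that the executable never sees would both be consistent with the library and the executable ending up with separate copies of the block, although nothing in this thread confirms that this is what happens here.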
[OMPI users] Scyld Beowulf and openmpi
Hello!

I am a new user of openmpi -- I've installed openmpi 1.2.6 for our x86_64 linux scyld beowulf cluster in order to make it run with the amber10 MD simulation package.

The nodes can see the home directory, i.e. a bpsh to the nodes works fine and lists all the files in the home directory where I have both openmpi and amber10 installed. However, if I try to run:

$MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/sander.MPI

I get the following error:

[0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
--
Failed to find the following executable:

Host: helios.structure.uic.edu
Executable: -o

Cannot continue.
--
[helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG: Not found in file rmgr_urm.c at line 462
[helios.structure.uic.edu:23611] mpirun: spawn failed with errno=-13

Any clues?

--
-Rima
Re: [OMPI users] Scyld Beowulf and openmpi
For starters, there is no "-no_local" option to mpirun. You might want to look at mpirun --help, or man mpirun.

I suspect the option you wanted was --nolocal. Note that --nolocal does not take an argument.

Mpirun is confused by the incorrect option and is looking for an incorrectly named executable.

Ralph

On Nov 3, 2008, at 10:30 AM, Rima Chaudhuri wrote:

> Hello!
> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our
> x86_64 linux scyld beowulf cluster in order to make it run with the
> amber10 MD simulation package.
>
> The nodes can see the home directory, i.e. a bpsh to the nodes works
> fine and lists all the files in the home directory where I have both
> openmpi and amber10 installed.
> However, if I try to run:
>
> $MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/sander.MPI
>
> I get the following error:
> [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
> --
> Failed to find the following executable:
>
> Host: helios.structure.uic.edu
> Executable: -o
>
> Cannot continue.
> --
> [helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG: Not found in
> file rmgr_urm.c at line 462
> [helios.structure.uic.edu:23611] mpirun: spawn failed with errno=-13
>
> Any clues?
>
> --
> -Rima
[OMPI users] Problems installing in Cygwin - Problem with GCC 3.4.4
Hi everyone,

Here's a "progress report"... more questions in the end :-)

Finally, I was *almost* able to compile OpenMPI in Cygwin using the following configure command:

./configure --prefix=/home/seabra/local/openmpi-1.3b1 \
            --with-mpi-param_check=always --with-threads=posix \
            --enable-mpi-threads --disable-io-romio \
            --enable-mca-no-build=memory_mallopt,maffinity,paffinity \
            --enable-contrib-no-build=vt \
            FC=g95 'FFLAGS=-O0 -fno-second-underscore' CXX=g++

I then had a very weird error during compilation of ompi/tools/ompi_info/params.cc. (See below).

The lines causing the compilation errors are:

vector.tcc:307: const size_type __len = __old_size + std::max(__old_size, __n);
vector.tcc:384: const size_type __len = __old_size + std::max(__old_size, __n);
stl_bvector.h:522: const size_type __len = size() + std::max(size(), __n);
stl_bvector.h:823: const size_type __len = size() + std::max(size(), __n);

(Notice that those are from the standard gcc libraries.)

After googling it for a while, I found that this error occurs because, at some point, the source code being compiled redefined the "max" function with a macro, so g++ cannot recognize the "std::max" in those lines and only "sees" a (...), thus printing that cryptic complaint.

I looked in some places in the OpenMPI code, but I couldn't find "max" being redefined anywhere, though I may be looking in the wrong places. Anyway, the only way I found of compiling OpenMPI was a very ugly hack: I had to go into those files and remove the "std::" before the "max". With that, it all compiled cleanly.

I did try running the tests in the 'tests' directory (with 'make check'), and I didn't get any alarming message, except that in some cases (class, threads, peruse) it printed "All 0 tests passed". I got an "All (n) tests passed" (n>0) for asm and datatype.

Can anybody comment on the meaning of those test results? Should I be alarmed by the "All 0 tests passed" messages?
Finally, in the absence of big red flags (that I noticed), I went ahead and tried to compile my program. However, as soon as compilation starts, I get the following:

/local/openmpi/openmpi-1.3b1/bin/mpif90 -c -O3 -fno-second-underscore -ffree-form -o constants.o _constants.f
--
Unfortunately, this installation of Open MPI was not compiled with Fortran 90 support. As such, the mpif90 compiler is non-functional.
--
make[1]: *** [constants.o] Error 1
make[1]: Leaving directory `/home/seabra/local/amber11/src/sander'
make: *** [parallel] Error 2

Notice that I compiled OpenMPI with g95, so there *should* be Fortran95 support... Any ideas on what could be going wrong?

Thank you very much,
Gustavo.

== Error in the compilation of params.cc ==

$ g++ -DHAVE_CONFIG_H -I. -I../../../opal/include -I../../../orte/include -I../../../ompi/include -I../../../opal/mca/paffinity/linux/plpa/src/libplpa -DOMPI_CONFIGURE_USER="\"seabra\"" -DOMPI_CONFIGURE_HOST="\"ACS02\"" -DOMPI_CONFIGURE_DATE="\"Sat Nov 1 20:44:32 EDT 2008\"" -DOMPI_BUILD_USER="\"$USER\"" -DOMPI_BUILD_HOST="\"`hostname`\"" -DOMPI_BUILD_DATE="\"`date`\"" -DOMPI_BUILD_CFLAGS="\"-O3 -DNDEBUG -finline-functions -fno-strict-aliasing \"" -DOMPI_BUILD_CPPFLAGS="\"-I../../.. -D_REENTRANT\"" -DOMPI_BUILD_CXXFLAGS="\"-O3 -DNDEBUG -finline-functions \"" -DOMPI_BUILD_CXXCPPFLAGS="\"-I../../.. -D_REENTRANT\"" -DOMPI_BUILD_FFLAGS="\"-O0 -fno-second-underscore\"" -DOMPI_BUILD_FCFLAGS="\"\"" -DOMPI_BUILD_LDFLAGS="\"-export-dynamic \"" -DOMPI_BUILD_LIBS="\"-lutil \"" -DOMPI_CC_ABSOLUTE="\"/usr/bin/gcc\"" -DOMPI_CXX_ABSOLUTE="\"/usr/bin/g++\"" -DOMPI_F77_ABSOLUTE="\"/usr/bin/g77\"" -DOMPI_F90_ABSOLUTE="\"/usr/local/bin/g95\"" -DOMPI_F90_BUILD_SIZE="\"small\"" -I../../.. -D_REENTRANT -O3 -DNDEBUG -finline-functions -MT param.o -MD -MP -MF $depbase.Tpo -c -o param.o param.cc
In file included from /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/vector:72,
                 from ../../../ompi/tools/ompi_info/ompi_info.h:24,
                 from param.cc:43:
/usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h: In member function `void std::vector::_M_insert_range(std::_Bit_iterator, _ForwardIterator, _ForwardIterator, std::forward_iterator_tag)':
/usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h:522: error: expected unqualified-id before '(' token
/usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h: In member function `void std::vector::_M_fill_insert(std::_Bit_iterator, size_t, bool)':
/usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits/stl_bvector.h:823: error: expected unqualified-id before '(' token
In file included from /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/vector:75,
Re: [OMPI users] MPI + Mixed language coding(Fortran90 + C++)
Can you replicate the scenario in smaller / different cases?

- write a sample plugin in C instead of C++
- write a non-MPI Fortran application that loads your C++ application
- ...?

In short, *MPI* shouldn't be interfering with Fortran/C++ common blocks. Try taking MPI out of the picture and see if that makes the problem go away.

Those are pretty much shots in the dark, but I don't know where to go, either -- try random things until you find what you want.

On Nov 3, 2008, at 3:51 AM, Rajesh Ramaya wrote:

> Hello Jeff, Gustavo, Mi,
> Thanks for the advice. I am familiar with the differences in compiler
> code generation for C, C++ and FORTRAN. I even looked at some of the
> common block symbols. The name of the symbol remains the same. The only
> difference that I observe is that in the FORTRAN-compiled *.o the symbol is
> 00515bc0 B aux7loc_
> and in the C++-compiled code it is
> U aux7loc_
> The memory is not allocated there, as it has been declared extern in C++.
> When the executable loads the shared library it finds all the undefined
> symbols. At least, if it fails to find a single symbol, it prints an
> undefined symbol error.
> I am completely stuck and do not know how to continue.
> Thanks,
> Rajesh
Re: [OMPI users] Problems installing in Cygwin - Problem with GCC 3.4.4
On Nov 3, 2008, at 2:53 PM, Gustavo Seabra wrote:

> Finally, I was *almost* able to compile OpenMPI in Cygwin using the
> following configure command:
>
> ./configure --prefix=/home/seabra/local/openmpi-1.3b1 \
>             --with-mpi-param_check=always --with-threads=posix \
>             --enable-mpi-threads --disable-io-romio \
>             --enable-mca-no-build=memory_mallopt,maffinity,paffinity \
>             --enable-contrib-no-build=vt \
>             FC=g95 'FFLAGS=-O0 -fno-second-underscore' CXX=g++

For your fortran issue, the Fortran 90 interface needs the Fortran 77 interface. So you need to supply an F77 as well (the output from configure should indicate that the F90 interface was disabled because the F77 interface was disabled).

> I then had a very weird error during compilation of
> ompi/tools/ompi_info/params.cc. (See below).
>
> The lines causing the compilation errors are:
>
> vector.tcc:307: const size_type __len = __old_size + std::max(__old_size, __n);
> vector.tcc:384: const size_type __len = __old_size + std::max(__old_size, __n);
> stl_bvector.h:522: const size_type __len = size() + std::max(size(), __n);
> stl_bvector.h:823: const size_type __len = size() + std::max(size(), __n);
>
> (Notice that those are from the standard gcc libraries.)
>
> After googling it for a while, I found that this error occurs
> because, at some point, the source code being compiled redefined the
> "max" function with a macro, so g++ cannot recognize the "std::max" in
> those lines and only "sees" a (...), thus printing that cryptic complaint.
>
> I looked in some places in the OpenMPI code, but I couldn't find
> "max" being redefined anywhere, though I may be looking in the wrong
> places. Anyway, the only way I found of compiling OpenMPI was a very
> ugly hack: I had to go into those files and remove the "std::" before
> the "max". With that, it all compiled cleanly.

I'm not sure I follow -- I don't see anywhere in OMPI where we use std::max. What areas did you find that you needed to change?

> I did try running the tests in the 'tests' directory (with 'make
> check'), and I didn't get any alarming message, except that in some
> cases (class, threads, peruse) it printed "All 0 tests passed". I got
> an "All (n) tests passed" (n>0) for asm and datatype.
>
> Can anybody comment on the meaning of those test results? Should I be
> alarmed by the "All 0 tests passed" messages?

No. We don't really maintain the "make check" stuff too well.

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] Problems installing in Cygwin - Problem with GCC 3.4.4
On Mon, Nov 3, 2008 at 3:04 PM, Jeff Squyres wrote:
> On Nov 3, 2008, at 2:53 PM, Gustavo Seabra wrote:
>
>> Finally, I was *almost* able to compile OpenMPI in Cygwin using the
>> following configure command:
>>
>> ./configure --prefix=/home/seabra/local/openmpi-1.3b1 \
>>             --with-mpi-param_check=always --with-threads=posix \
>>             --enable-mpi-threads --disable-io-romio \
>>             --enable-mca-no-build=memory_mallopt,maffinity,paffinity \
>>             --enable-contrib-no-build=vt \
>>             FC=g95 'FFLAGS=-O0 -fno-second-underscore' CXX=g++
>
> For your fortran issue, the Fortran 90 interface needs the Fortran 77
> interface. So you need to supply an F77 as well (the output from configure
> should indicate that the F90 interface was disabled because the F77
> interface was disabled).

Is that what you mean (see below)? I thought the g95 compiler could deal with F77 as well as F95... If so, could I just pass F77='g95'?

(From the configure output)
*** Fortran 90/95 compiler
checking whether we are using the GNU Fortran compiler... yes
checking whether g95 accepts -g... yes
checking if Fortran compiler works... yes
checking whether g77 and g95 compilers are compatible... no
configure: WARNING: *** Fortran 77 and Fortran 90 compilers are not link compatible
configure: WARNING: *** Disabling MPI Fortran 90/95 bindings
checking for extra arguments to build a shard library... none needed
checking to see if F90 compiler likes the C++ exception flags... skipped (no F90 bindings)

>
>> I then had a very weird error during compilation of
>> ompi/tools/ompi_info/params.cc. (See below).
>>
>> The lines causing the compilation errors are:
>>
>> vector.tcc:307: const size_type __len = __old_size +
>> std::max(__old_size, __n);
>> vector.tcc:384: const size_type __len = __old_size +
>> std::max(__old_size, __n);
>> stl_bvector.h:522: const size_type __len = size() + std::max(size(),
>> __n);
>> stl_bvector.h:823: const size_type __len = size() + std::max(size(),
>> __n);
>>
>> (Notice that those are from the standard gcc libraries.)
>>
>> After googling it for a while, I found that this error occurs
>> because, at some point, the source code being compiled redefined the
>> "max" function with a macro, so g++ cannot recognize the "std::max" in
>> those lines and only "sees" a (...), thus printing that
>> cryptic complaint.
>>
>> I looked in some places in the OpenMPI code, but I couldn't find
>> "max" being redefined anywhere, though I may be looking in the wrong
>> places. Anyway, the only way I found of compiling OpenMPI was a very
>> ugly hack: I had to go into those files and remove the "std::" before
>> the "max". With that, it all compiled cleanly.
>
> I'm not sure I follow -- I don't see anywhere in OMPI where we use std::max.
> What areas did you find that you needed to change?

These files are part of the standard C++ headers. In my case, they sit in:

/usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits

In principle, the problems that come from those files would mean that the OpenMPI source has some macro redefining max, but that's what I could not find :-(

>> I did try running the tests in the 'tests' directory (with 'make
>> check'), and I didn't get any alarming message, except that in some
>> cases (class, threads, peruse) it printed "All 0 tests passed". I got
>> an "All (n) tests passed" (n>0) for asm and datatype.
>>
>> Can anybody comment on the meaning of those test results? Should I be
>> alarmed by the "All 0 tests passed" messages?
>
> No. We don't really maintain the "make check" stuff too well.

Oh well... What do you use for testing the implementation?

Thanks a lot!
Gustavo.
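The failure mode discussed in this thread, a function-like macro named max getting in the way of std::max, can be reproduced and worked around in a few lines of C++. This is a minimal sketch, not a diagnosis of this particular build: the macro below is a hypothetical stand-in for whatever header defines max here, and the function names grow_len and grow_len_after_undef are made up for the example. The min/max macros in the Windows headers (unless NOMINMAX is defined) are a common source of this kind of clash, but whether they are involved in this Cygwin build is an assumption.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical offender: some header defines a function-like macro "max".
#define max(a, b) (((a) > (b)) ? (a) : (b))

// With the macro active, the preprocessor turns `std::max(x, y)` into
// `std::(((x) > (y)) ? (x) : (y))`, and g++ then reports
// "expected unqualified-id before '(' token".

// Workaround 1: parenthesise the name. A function-like macro is expanded
// only when its name is immediately followed by '(', so `(std::max)` is
// left alone and the real template is called.
inline std::size_t grow_len(std::size_t old_size, std::size_t n) {
    return old_size + (std::max)(old_size, n);
}

// Workaround 2: drop the macro before any code that uses std::max.
#undef max

inline std::size_t grow_len_after_undef(std::size_t old_size, std::size_t n) {
    return old_size + std::max(old_size, n);
}
```

Either approach leaves the standard headers untouched, which is gentler than editing the "std::" out of vector.tcc and stl_bvector.h. For the second one the #undef has to come before the standard headers are included in the affected source file.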
Re: [OMPI users] users Digest, Vol 1055, Issue 2
Thanks a lot Ralph!

I corrected the no_local to nolocal, and now when I try to execute the script step1 (pls find it attached):

[rchaud@helios amber10]$ ./step1
[helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
--
There are no available nodes allocated to this job. This could be because no nodes were found or all the available nodes were already used.

Note that since the -nolocal option was given no processes can be launched on the local node.
--
[helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file base/rmaps_base_support_fns.c at line 168
[helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file rmaps_rr.c at line 402
[helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file base/rmaps_base_map_job.c at line 210
[helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file rmgr_urm.c at line 372
[helios.structure.uic.edu:16335] mpirun: spawn failed with errno=-3

If I use the script without the --nolocal option, I get the following error:

[helios.structure.uic.edu:20708] [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247

thanks,
Re: [OMPI users] Problems installing in Cygwin - Problem with GCC 3.4.4
On Nov 3, 2008, at 3:36 PM, Gustavo Seabra wrote:

>> For your fortran issue, the Fortran 90 interface needs the Fortran 77
>> interface. So you need to supply an F77 as well (the output from configure
>> should indicate that the F90 interface was disabled because the F77
>> interface was disabled).
>
> Is that what you mean (see below)?

Ah yes -- that's another reason the f90 interface could be disabled: if configure detects that the f77 and f90 compilers are not link-compatible.

> I thought the g95 compiler could
> deal with F77 as well as F95... If so, could I just pass F77='g95'?

That would probably work (F77=g95). I don't know the g95 compiler at all, so I don't know if it also accepts Fortran-77-style codes. But if it does, then you're set. Otherwise, specify a different F77 compiler that is link compatible with g95 and you should be good.

>>> I looked in some places in the OpenMPI code, but I couldn't find
>>> "max" being redefined anywhere, though I may be looking in the wrong
>>> places. Anyway, the only way I found of compiling OpenMPI was a very
>>> ugly hack: I had to go into those files and remove the "std::" before
>>> the "max". With that, it all compiled cleanly.
>>
>> I'm not sure I follow -- I don't see anywhere in OMPI where we use std::max.
>> What areas did you find that you needed to change?
>
> These files are part of the standard C++ headers. In my case, they sit in:
> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits

Ah, I see.

> In principle, the problems that come from those files would mean that
> the OpenMPI source has some macro redefining max, but that's what I
> could not find :-(

Gotcha. I don't think we are defining a "max" macro anywhere in the ompi_info source or related header files. :-(

>> No. We don't really maintain the "make check" stuff too well.
>
> Oh well... What do you use for testing the implementation?

We have a whole pile of MPI tests in a private SVN repository. The repository is only private because it contains a lot of other people's [public] MPI test suites and benchmarks, and we never looked into redistribution rights for their software. There's nothing really secret about it -- we just haven't bothered to look into the IP issues. :-)

We use the MPI Testing Tool (MTT) for nightly regression across the community:

http://www.open-mpi.org/mtt/

We have weekday and weekend testing schedules. M-Th we do nightly tests; F-Mon morning, we do a long weekend schedule. This weekend, for example, we ran about 675k regression tests:

http://www.open-mpi.org/mtt/index.php?do_redir=875

--
Jeff Squyres
Cisco Systems
[OMPI users] switch from mpich2 to openMPI
I just found out I need to switch from mpich2 to openMPI for some code I'm running. I noticed that it's available in an openSuSE repo (I'm using openSuSE 11.0 x86_64 on a TYAN 32-processor Opteron 8000 system), but when I was using mpich2 I seemed to have better luck compiling it from source. This is the line I used:

# $ F77=/path/to/g95 F90=/path/to/g95 ./configure --prefix=/some/place/mpich2-install

But usually I left the "--prefix=" off and just let it install to its default... which is /usr/local/bin, and that's nice because it's already in the PATH and very usable.

I guess my question is whether or not the defaults and configuration syntax have stayed the same in openMPI. I also could use a "quickstart" guide for a non-programming user (e.g., I think I have to start a daemon before running parallelized programs).

THANKS!!!

PattiM.
Re: [OMPI users] users Digest, Vol 1055, Issue 2
The problem is that you didn't specify or allocate any nodes for the job. At the least, you need to tell us which nodes to use via a hostfile.

Alternatively, are you using a resource manager to assign the nodes? OMPI didn't see anything from one, but it could be that we just didn't see the right envar.

Ralph

On Nov 3, 2008, at 1:39 PM, Rima Chaudhuri wrote:

> Thanks a lot Ralph!
> I corrected no_local to nolocal, and now when I try to execute the
> script step1 (please find it attached), I get:
>
> [rchaud@helios amber10]$ ./step1
> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
> --
> There are no available nodes allocated to this job. This could be because
> no nodes were found or all the available nodes were already used.
>
> Note that since the -nolocal option was given no processes can be
> launched on the local node.
> --
> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file base/rmaps_base_support_fns.c at line 168
> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file rmaps_rr.c at line 402
> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file base/rmaps_base_map_job.c at line 210
> [helios.structure.uic.edu:16335] [0,0,0] ORTE_ERROR_LOG: Temporarily out of resource in file rmgr_urm.c at line 372
> [helios.structure.uic.edu:16335] mpirun: spawn failed with errno=-3
>
> If I use the script without the --nolocal option, I get the following
> error:
>
> [helios.structure.uic.edu:20708] [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
>
> thanks,
>
> On Mon, Nov 3, 2008 at 2:04 PM, wrote:
>
>> Send users mailing list submissions to
>> us...@open-mpi.org
>>
>> To subscribe or unsubscribe via the World Wide Web, visit
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> or, via email, send a message with subject or body 'help' to
>> users-requ...@open-mpi.org
>>
>> You can reach the person managing the list at
>> users-ow...@open-mpi.org
>>
>> When replying, please edit your Subject line so it is more specific
>> than "Re: Contents of users digest..."
>>
>> Today's Topics:
>>
>> 1. Scyld Beowulf and openmpi (Rima Chaudhuri)
>> 2. Re: Scyld Beowulf and openmpi (Ralph Castain)
>> 3. Problems installing in Cygwin - Problem with GCC 3.4.4 (Gustavo Seabra)
>> 4. Re: MPI + Mixed language coding(Fortran90 + C++) (Jeff Squyres)
>> 5. Re: Problems installing in Cygwin - Problem with GCC 3.4.4 (Jeff Squyres)
>>
>> --
>>
>> Message: 1
>> Date: Mon, 3 Nov 2008 11:30:01 -0600
>> From: "Rima Chaudhuri"
>> Subject: [OMPI users] Scyld Beowulf and openmpi
>> To: us...@open-mpi.org
>> Message-ID: <7503b17d0811030930i13acb974kc627983a1d481...@mail.gmail.com>
>> Content-Type: text/plain; charset=ISO-8859-1
>>
>> Hello!
>> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our
>> x86_64 linux scyld beowulf cluster in order to make it run with the
>> amber10 MD simulation package.
>>
>> The nodes can see the home directory, i.e. a bpsh to the nodes works
>> fine and lists all the files in the home directory where I have both
>> openmpi and amber10 installed.
>> However, if I try to run:
>>
>> $MPI_HOME/bin/mpirun -no_local=1 -np 4 $AMBERHOME/exe/sander.MPI
>>
>> I get the following error:
>>
>> [0,0,0] ORTE_ERROR_LOG: Not available in file ras_bjs.c at line 247
>> --
>> Failed to find the following executable:
>>
>> Host: helios.structure.uic.edu
>> Executable: -o
>>
>> Cannot continue.
>> --
>> [helios.structure.uic.edu:23611] [0,0,0] ORTE_ERROR_LOG: Not found in file rmgr_urm.c at line 462
>> [helios.structure.uic.edu:23611] mpirun: spawn failed with errno=-13
>>
>> Any clues?
>> --
>> -Rima
>>
>> --
>>
>> Message: 2
>> Date: Mon, 3 Nov 2008 12:08:36 -0700
>> From: Ralph Castain
>> Subject: Re: [OMPI users] Scyld Beowulf and openmpi
>> To: Open MPI Users
>> Message-ID: <91044a7e-ada5-4b94-aa11-b3c1d9843...@lanl.gov>
>> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>>
>> For starters, there is no "-no_local" option to mpirun. You might want
>> to look at mpirun --help, or man mpirun.
>>
>> I suspect the option you wanted was --nolocal. Note that --nolocal does
>> not take an argument.
>>
>> Mpirun is confused by the incorrect option and is looking for an
>> incorrectly named executable.
>> Ralph
>>
>> On Nov 3, 2008, at 10:30 AM, Rima Chaudhuri wrote:
>>
>>> Hello!
>>> I am a new user of openmpi -- I've installed openmpi 1.2.6 for our
>>> x86_64 linux scyld beowulf cluster in order to make it run with the
>>> amber10 MD simulation package.
>>>
>>> The nodes can see the home directory, i.e. a bpsh to the nodes works
>>> fine and lists all the files in the home directory where I have both
>>> openmpi and amber10 installed.
>>> However if I try to ru
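
Ralph's suggestion to supply a hostfile can be sketched roughly as follows. This is only an illustration: the host names node1 and node2 are placeholders that do not appear in the thread, and the launch line is printed rather than executed so the sketch runs on a machine without Open MPI. The `slots=` syntax and the `--hostfile`, `--nolocal`, and `-np` options are the ones discussed in this thread.

```shell
#!/bin/sh
# Sketch: write a small hostfile and derive -np from its slot counts.
# node1/node2 are placeholder host names, not taken from the thread.
hostfile=$(mktemp)
cat > "$hostfile" <<'EOF'
# one host per line; slots = processes allowed on that host
node1 slots=2
node2 slots=2
EOF

# Sum the slots= values, skipping comment lines and blank lines.
total_slots=$(awk '
  /^#/ || NF == 0 { next }
  { for (i = 2; i <= NF; i++) if ($i ~ /^slots=/) n += substr($i, 7) }
  END { print n + 0 }
' "$hostfile")

echo "total slots: $total_slots"
# Print (rather than run) the launch line, so the sketch works without MPI.
echo "mpirun --hostfile $hostfile --nolocal -np $total_slots \$AMBERHOME/exe/sander.MPI"
```

Deriving `-np` from the hostfile keeps the process count and the slot count from drifting apart when nodes are added or removed.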
Re: [OMPI users] switch from mpich2 to openMPI
On Nov 3, 2008, at 3:59 PM, PattiMichelle wrote:

> I just found out I need to switch from mpich2 to openMPI for some code
> I'm running. I noticed that it's available in an openSuSE repo (I'm
> using openSuSE 11.0 x86_64 on a TYAN 32-processor Opteron 8000 system),
> but when I was using mpich2 I seemed to have better luck compiling it
> from code. This is the line I used:
>
> # $ F77=/path/to/g95 F90=/path/to/g95 ./configure --prefix=/some/place/mpich2-install

Use FC=/path/to/g95 instead of F90. Better yet, put the F77 and FC after the ./configure:

$ ./configure --prefix=/wherever FC=/path/to/g95 F77=/path/to/g95

> But usually I left the "--prefix=" off and just let it install to its
> default... which is /usr/local/bin and that's nice because it's already
> in the PATH and very usable.

That would be fine as well. But ensure that you install MPICH[2] and Open MPI in two different prefixes -- we have a few executables and libraries that have the same name, so if you install them into the same location, they'll overwrite each other.

> I guess my question is whether or not the defaults and configuration
> syntax have stayed the same in openMPI. I also could use a "quickstart"
> guide for a non-programming user (e.g., I think I have to start a
> daemon before running parallelized programs).

Our mpirun/mpiexec is a little different than MPICH[2]'s -- you might want to check out the man page. OMPI doesn't use user-started daemons for most cases; you should just be able to "mpirun ..." right out of the box. If you're not using a resource manager, you'll likely need to supply a hostfile, but OMPI's mpirun should support the same syntax as MPICH[2]'s hostfiles.

Your MPI apps should compile with Open MPI if you use our wrapper compilers (mpicc, mpif90, etc.). Most well-written MPI apps will run properly with multiple MPI implementations, but it's certainly possible that you'll run into a few snags if you inadvertently coded your app to some particular characteristics of MPICH[2].

The best way to know is just to try running it and see what happens.

--
Jeff Squyres
Cisco Systems
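
The two pieces of configure advice above (pass FC and F77 after ./configure, and keep Open MPI and MPICH[2] in separate prefixes) could be captured in a small helper like the one below. It is a sketch only: the function name, both prefixes, and the g95 path are placeholders, and the helper prints the configure line instead of running it.

```shell
#!/bin/sh
# Sketch: build the configure line described above, refusing a shared prefix.
# The function name, both prefixes, and the g95 path are placeholders.
ompi_configure_line() {
  ompi_prefix=$1
  mpich_prefix=$2
  fortran=$3
  if [ "$ompi_prefix" = "$mpich_prefix" ]; then
    echo "error: Open MPI and MPICH2 must not share a prefix" >&2
    return 1
  fi
  # Variables go after ./configure; FC (not F90) names the Fortran 90 compiler.
  echo "./configure --prefix=$ompi_prefix FC=$fortran F77=$fortran"
}

ompi_configure_line /opt/openmpi /opt/mpich2 /path/to/g95
```

Printing the command first makes it easy to review the prefix before anything is installed over an existing MPICH[2] tree.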
Re: [OMPI users] Beowulf cluster and openmpi
I added the option -hostfile machinefile, where machinefile is a file with the IPs of the nodes:

#host names
192.168.0.100 slots=2
192.168.0.101 slots=2
192.168.0.102 slots=2
192.168.0.103 slots=2
192.168.0.104 slots=2
192.168.0.105 slots=2
192.168.0.106 slots=2
192.168.0.107 slots=2
192.168.0.108 slots=2
192.168.0.109 slots=2

[rchaud@helios amber10]$ ./step1
--
A daemon (pid 29837) launched by the bproc PLS component on node 192 died
unexpectedly so we are aborting.

This may be because the daemon was unable to find all the needed shared
libraries on the remote node. You may set your LD_LIBRARY_PATH to have the
location of the shared libraries on the remote nodes and this will
automatically be forwarded to the remote nodes.
--
[helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file pls_bproc.c at line 717
[helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file pls_bproc.c at line 1164
[helios.structure.uic.edu:29836] [0,0,0] ORTE_ERROR_LOG: Error in file rmgr_urm.c at line 462
[helios.structure.uic.edu:29836] mpirun: spawn failed with errno=-1

I used bpsh to see if the master and one of the nodes (n8) could see $LD_LIBRARY_PATH, and they do:

[rchaud@helios amber10]$ echo $LD_LIBRARY_PATH
/home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib

[rchaud@helios amber10]$ bpsh n8 echo $LD_LIBRARY_PATH
/home/rchaud/openmpi-1.2.6/openmpi-1.2.6_ifort/lib

thanks!

On Mon, Nov 3, 2008 at 3:14 PM, wrote:
> Today's Topics:
>
> 1. Re: Problems installing in Cygwin - Problem with GCC 3.4.4
>    (Jeff Squyres)
> 2. switch from mpich2 to openMPI (PattiMichelle)
> 3. Re: users Digest, Vol 1055, Issue 2 (Ralph Castain)
>
> --
>
> Message: 1
> Date: Mon, 3 Nov 2008 15:52:22 -0500
> From: Jeff Squyres
> Subject: Re: [OMPI users] Problems installing in Cygwin - Problem with
>    GCC 3.4.4
> To: "Gustavo Seabra"
> Cc: Open MPI Users
> Message-ID:
> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
>
> On Nov 3, 2008, at 3:36 PM, Gustavo Seabra wrote:
>
>>> For your fortran issue, the Fortran 90 interface needs the Fortran 77
>>> interface. So you need to supply an F77 as well (the output from
>>> configure should indicate that the F90 interface was disabled because
>>> the F77 interface was disabled).
>>
>> Is that what you mean (see below)?
>
> Ah yes -- that's another reason the f90 interface could be disabled:
> if configure detects that the f77 and f90 compilers are not
> link-compatible.
>
>> I thought the g95 compiler could
>> deal with F77 as well as F95... If so, could I just pass F77='g95'?
>
> That would probably work (F77=g95). I don't know the g95 compiler at
> all, so I don't know if it also accepts Fortran-77-style codes. But
> if it does, then you're set. Otherwise, specify a different F77
> compiler that is link compatible with g95 and you should be good.
>
>>>> I looked in some places in the OpenMPI code, but I couldn't find
>>>> "max" being redefined anywhere, but I may be looking in the wrong
>>>> places. Anyways, the only way I found of compiling OpenMPI was a
>>>> very ugly hack: I have to go into those files and remove the "std::"
>>>> before the "max". With that, it all compiled cleanly.
>>>
>>> I'm not sure I follow -- I don't see anywhere in OMPI where we use
>>> std::max.
>>> What areas did you find that you needed to change?
>>
>> These files are part of the standard C++ headers. In my case, they
>> sit in:
>> /usr/lib/gcc/i686-pc-cygwin/3.4.4/include/c++/bits
>
> Ah, I see.
>
>> In principle, the problems that come from those files would mean that
>> the OpenMPI source has some macro redefining max, but that's what I
>> could not find :-(
>
> Gotcha. I don't think we are defining a "max" macro anywhere in the
> ompi_info source or related header files. :-(
>
>>> No. We don't really maintain the "make check" stuff too well.
>>
>> Oh well... What do you use for testing the implementation?
>
> We have a whole pile of MPI tests in a private SVN repository. The
> repository is only private because it contains a lot of other people's
> [public] MPI test suites and benchmarks, and we never looked into
> redistribution rights for their software. There's nothing really
> secret about it -- we just haven't
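
A note on the bpsh check earlier in this message: in `bpsh n8 echo $LD_LIBRARY_PATH`, the unquoted variable is expanded by the shell on the master before bpsh starts, so the command prints the master's value whatever the node's environment holds. The sketch below shows the same effect without a cluster. It rests on assumptions: `env -i /bin/sh -c` stands in for the remote side (a child shell with an empty environment), and `DEMO_PATH` with its value is a placeholder used in place of `LD_LIBRARY_PATH` so the sketch does not disturb the real loader path.

```shell
#!/bin/sh
# Sketch: who expands the variable, the calling shell or the child?
# DEMO_PATH is a placeholder standing in for LD_LIBRARY_PATH.
DEMO_PATH=/local/only/lib

# Double quotes: the calling shell substitutes the value before the child runs.
early=$(env -i /bin/sh -c "echo value=$DEMO_PATH")

# Single quotes: the child shell expands the name in its own, empty environment.
late=$(env -i /bin/sh -c 'echo value=$DEMO_PATH')

echo "expanded by caller: $early"
echo "expanded by child:  $late"
```

On the cluster, wrapping the check in single quotes and a remote shell (something like `bpsh n8 sh -c 'echo $LD_LIBRARY_PATH'`, untested here) would report what the node-side process actually sees. It may well turn out to be the same path, but the unquoted form cannot tell the two cases apart.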
Re: [OMPI users] Problems installing in Cygwin - Problem with GCC 3.4.4
> On Nov 3, 2008, at 3:36 PM, Gustavo Seabra wrote:
>
>>> For your fortran issue, the Fortran 90 interface needs the Fortran 77
>>> interface. So you need to supply an F77 as well (the output from
>>> configure should indicate that the F90 interface was disabled because
>>> the F77 interface was disabled).
>>
>> Is that what you mean (see below)?
>
> Ah yes -- that's another reason the f90 interface could be disabled:
> if configure detects that the f77 and f90 compilers are not
> link-compatible.
>
>> I thought the g95 compiler could
>> deal with F77 as well as F95... If so, could I just pass F77='g95'?
>
> That would probably work (F77=g95). I don't know the g95 compiler at
> all, so I don't know if it also accepts Fortran-77-style codes. But
> if it does, then you're set. Otherwise, specify a different F77
> compiler that is link compatible with g95 and you should be good.

Fortran 90 is a superset of the archaic, hamstrung, "I'm too old to learn how to program in a useful manner and I still use punched cards" Fortran 77. All Fortran 90 compilers are Fortran 77 compilers, by definition. Fortran 95 has a few (~5) deleted features and a few minor added features. I've never heard of a Fortran 95 compiler that wasn't a Fortran 90 compiler, and thus a Fortran 77 compiler.

Take g77 and throw it away. While it's not particularly buggy, it hasn't been maintained for years and should be out-performed by a more modern compiler such as g95 or gfortran.
Re: [OMPI users] Problems installing in Cygwin - Problem with GCC 3.4.4
On Mon, Nov 3, 2008 at 8:59 PM, Terry Frankcombe wrote:
>> On Nov 3, 2008, at 3:36 PM, Gustavo Seabra wrote:
>>
>>>> For your fortran issue, the Fortran 90 interface needs the Fortran 77
>>>> interface. So you need to supply an F77 as well (the output from
>>>> configure should indicate that the F90 interface was disabled because
>>>> the F77 interface was disabled).
>>>
>>> Is that what you mean (see below)?
>>
>> Ah yes -- that's another reason the f90 interface could be disabled:
>> if configure detects that the f77 and f90 compilers are not
>> link-compatible.
>>
>>> I thought the g95 compiler could
>>> deal with F77 as well as F95... If so, could I just pass F77='g95'?
>>
>> That would probably work (F77=g95). I don't know the g95 compiler at
>> all, so I don't know if it also accepts Fortran-77-style codes. But
>> if it does, then you're set. Otherwise, specify a different F77
>> compiler that is link compatible with g95 and you should be good.
>
> Fortran 90 is a superset of the archaic, hamstrung, "I'm too old to learn
> how to program in a useful manner and I still use punched cards" Fortran
> 77. All Fortran 90 compilers are Fortran 77 compilers, by definition.
> Fortran 95 has a few (~5) deleted features and a few minor added features.
> I've never heard of a Fortran 95 compiler that wasn't a Fortran 90
> compiler, and thus a Fortran 77 compiler.
>
> Take g77 and throw it away. While it's not particularly buggy, it hasn't
> been maintained for years and should be out-performed by a more modern
> compiler such as g95 or gfortran.

OK, so I tried to set all my fortran compilers to g95... But for some reason, it looks like g95 and g95 (!) compilers are not compatible... How's that even possible? (See below)

Thanks a lot,
Gustavo.

--
./configure --prefix=$MPI_HOME \
    --with-mpi-param_check=always \
    --with-threads=posix \
    --enable-mpi-threads \
    --disable-io-romio \
    --enable-mca-no-build=memory_mallopt,maffinity,paffinity \
    --enable-contrib-no-build=vt \
    FC='g95' F77='g95' F90='g95' F95='g95' \
    CC='gcc' CXX='g++' \
    FFLAGS='-O0 -fno-second-underscore'

*** Fortran 77 compiler
checking whether we are using the GNU Fortran 77 compiler... yes
checking whether g95 accepts -g... yes
checking if Fortran 77 compiler works... yes
checking g95 external symbol convention... single underscore
checking if C and Fortran 77 are link compatible... yes
(just a ton of "yes"-es)

*** Fortran 90/95 compiler
checking whether we are using the GNU Fortran compiler... yes
checking whether g95 accepts -g... yes
checking if Fortran compiler works... yes
checking whether g95 and g95 compilers are compatible... no
configure: WARNING: *** Fortran 77 and Fortran 90 compilers are not link compatible
configure: WARNING: *** Disabling MPI Fortran 90/95 bindings
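
One thing that may be worth checking in the configure line above (a guess, not something confirmed in this thread): by Autoconf convention, FFLAGS is passed only to the F77 compiler, while FCFLAGS is its counterpart for FC. If that holds here, `-fno-second-underscore` reaches g95 only when it is acting as the Fortran 77 compiler, and the two invocations could end up with different symbol-naming conventions. The helper names below are hypothetical, and the sketch only compares flag strings; it does not invoke any compiler.

```shell
#!/bin/sh
# Sketch: report whether the F77 and F90 flag sets agree on
# -fno-second-underscore. Both helper names are hypothetical.
underscore_mode() {
  case " $1 " in
    *" -fno-second-underscore "*) echo no-second-underscore ;;
    *) echo compiler-default ;;
  esac
}

flags_agree() {
  if [ "$(underscore_mode "$1")" = "$(underscore_mode "$2")" ]; then
    echo consistent
  else
    echo mismatch
  fi
}

# As in the configure line above: FFLAGS set, FCFLAGS left empty.
flags_agree "-O0 -fno-second-underscore" ""
# With the same flags passed to both compilers.
flags_agree "-O0 -fno-second-underscore" "-O0 -fno-second-underscore"
```

If the mismatch is indeed the cause, adding `FCFLAGS='-O0 -fno-second-underscore'` to the configure line would be the natural thing to try.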