[OMPI users] Restarting from a checkpoint (OMPI 1.3)
Hey,

I'm trying the newly released Open MPI 1.3 in conjunction with BLCR to provide the checkpoint/restart feature. I configured it with:

./configure --prefix=/usr/local --with-ft=cr --enable-ft-thread --enable-mpi-threads --with-blcr=/

An MPI job on a single machine (several threads) is checkpointed and restarted without problems. The checkpoint of an MPI job across two hosts (Ethernet, TCP) also completes without warnings or errors (the home directory and the directory containing the MPI application are shared over NFS). The restart works too, but all threads are started only on the host where I enter the ompi-restart command. Even if I add the -hostfile argument to ompi-restart, only that one host is used.

Does anybody have a hint?

Thanks,
Gregor
[OMPI users] Open-MPI 1.3 and environment variables
Under 1.2.8 I could check OMPI_MCA_ns_nds_vpid to find out the process rank. Under 1.3 that variable does not seem to exist anymore. Is there an equivalent to that variable in 1.3? Have any other environment variables changed?

Thank you,
Jody
Re: [OMPI users] Problem compiling open mpi 1.3 with sunstudio12 express
Can you send the information listed here:

http://www.open-mpi.org/community/help/

On Jan 19, 2009, at 11:49 AM, Olivier Marsden wrote:

> Hello,
> I'm trying to compile ompi 1.3rc7 with the sun studio express compilers. I'm using the following configure command:
>
> CC=/opt/sun/express/sunstudioceres/bin/cc CXX=/opt/sun/express/sunstudioceres/bin/CC F77=/opt/sun/express/sunstudioceres/bin/f77 FC=/opt/sun/express/sunstudioceres/bin/f90 ./configure --prefix=/opt/mpi_sun --enable-heterogeneous --enable-shared --enable-mpi-f90 --with-mpi-f90-size=small --disable-mpi-threads --disable-progress-threads --disable-debug --without-udapl --disable-io-romio
>
> The build and install execute correctly. However, I get the following when trying to use mpif90:
>
> >> /opt/mpi_sun/bin/mpif90
> gfortran: no input files
>
> My /opt/mpi_sun/share/openmpi/mpif90-wrapper-data.txt file appears to my layman's eye to be correct, but just in case, its contents are the following:
>
> project=Open MPI
> project_short=OMPI
> version=1.3rc7
> language=Fortran 90
> compiler_env=FC
> compiler_flags_env=FCFLAGS
> compiler=/opt/sun/express/sunstudioceres/bin/f90
> module_option=-M
> extra_includes=
> preprocessor_flags=
> compiler_flags=
> linker_flags=
> libs=-lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl
> required_file=
> includedir=${includedir}
> libdir=${libdir}
>
> Can anyone see why gfortran is being used? (The config.log says that sun f90 is used.)
>
> Thanks,
> Olivier
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
Cisco Systems
[OMPI users] error opal/libltdl
Hi...

I am trying to install openmpi-1.2.8 on RHEL-4. I am getting the following error.

[clususer@vlsiserver openmpi-1.2.8]$ make all install
Making all in config
make[1]: Entering directory `/home/clususer/openmpi-1.2.8/config'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/clususer/openmpi-1.2.8/config'
Making all in contrib
make[1]: Entering directory `/home/clususer/openmpi-1.2.8/contrib'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/home/clususer/openmpi-1.2.8/contrib'
Making all in opal
make[1]: Entering directory `/home/clususer/openmpi-1.2.8/opal'
Making all in include
make[2]: Entering directory `/home/clususer/openmpi-1.2.8/opal/include'
make all-am
make[3]: Entering directory `/home/clususer/openmpi-1.2.8/opal/include'
make[3]: Leaving directory `/home/clususer/openmpi-1.2.8/opal/include'
make[2]: Leaving directory `/home/clususer/openmpi-1.2.8/opal/include'
Making all in libltdl
make[2]: Entering directory `/home/clususer/openmpi-1.2.8/opal/libltdl'
make all-am
make[3]: Entering directory `/home/clususer/openmpi-1.2.8/opal/libltdl'
/bin/sh ./libtool --tag=CC --mode=link gcc -O3 -DNDEBUG -module -avoid-version -o dlopen.la dlopen.lo -ldl -ldl -lnsl -lutil -lm
libtool: link: false cru .libs/dlopen.a .libs/dlopen.o
make[3]: *** [dlopen.la] Error 1
make[3]: Leaving directory `/home/clususer/openmpi-1.2.8/opal/libltdl'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/clususer/openmpi-1.2.8/opal/libltdl'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/home/clususer/openmpi-1.2.8/opal'
make: *** [all-recursive] Error 1
[clususer@vlsiserver openmpi-1.2.8]$
Re: [OMPI users] Open-MPI 1.3 and environment variables
That was never an envar for public use, but rather one that was used internally by OMPI and therefore subject to change (which it did). There are a number of such variables in the system; we try to indicate this by not exposing them via ompi_info. Of course, you can discover them independently with printenv, but we would not recommend relying on them.

The list of reliable envars is provided here:

http://www.open-mpi.org/faq/?category=running#mpi-environmental-variables

Note that this list applies from 1.3 onward and does not apply to any prior release.

Ralph

On Jan 20, 2009, at 4:43 AM, jody wrote:

> Under 1.2.8 I could check OMPI_MCA_ns_nds_vpid to find out the process rank.
> Under 1.3 that variable does not seem to exist anymore.
> Is there an equivalent to that variable in 1.3?
> Have any other environment variables changed?
>
> Thank you,
> Jody
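As a rough illustration of relying on the documented variables rather than the internal one, here is a minimal sketch of a launch script that reads the rank. It assumes the names OMPI_COMM_WORLD_RANK and OMPI_COMM_WORLD_SIZE from the FAQ page linked above (1.3 and later only); the helper names ompi_rank and ompi_size and the "unknown" fallback are made up for this example.

```sh
#!/bin/sh
# Print this process's rank as exported by mpirun (Open MPI 1.3 or later).
# Falls back to "unknown" when the script is not started under mpirun.
# ompi_rank and ompi_size are hypothetical helper names, not part of Open MPI.
ompi_rank() {
    printf '%s\n' "${OMPI_COMM_WORLD_RANK:-unknown}"
}

# Print the number of processes in the job the same way.
ompi_size() {
    printf '%s\n' "${OMPI_COMM_WORLD_SIZE:-unknown}"
}

echo "rank $(ompi_rank) of $(ompi_size)"
```

Started as "mpirun -np 4 ./script.sh" (script name hypothetical), each copy should report its own rank; run directly from a shell it reports "unknown".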
Re: [OMPI users] Open-MPI 1.3 and environment variables
Thanks for the clarification.

Speaking of 'printenv': I noticed that even though $HOSTNAME is set on all of my machines:

aim-nano_03 ~ # echo $HOSTNAME
aim-nano_03

it does not appear in printenv's output:

aim-nano_03 opt # printenv | grep HOST
aim-nano_03 opt #

Is there something special about $HOSTNAME and how or when it is set?

Jody

On Tue, Jan 20, 2009 at 3:26 PM, Ralph Castain wrote:
> That was never an envar for public use, but rather one that was used
> internal to OMPI and therefore subject to change (which it did). There are a
> number of such variables in the system - we try to indicate this by not
> exposing them via ompi_info. Of course, you can independently discover them
> with a printenv, but we would not recommend relying on them.
>
> The list of reliable envars is provided here:
>
> http://www.open-mpi.org/faq/?category=running#mpi-environmental-variables
>
> Note that this begins with 1.3 and does not apply to any prior releases.
>
> Ralph
Re: [OMPI users] Open-MPI 1.3 and environment variables
Not that I know of... I would expect it to work.

On Jan 20, 2009, at 8:47 AM, jody wrote:

> Thanks for the clarification.
>
> Speaking of 'printenv' - i noticed that even though $HOSTNAME is set on all of my machines:
>
> aim-nano_03 ~ # echo $HOSTNAME
> aim-nano_03
>
> it does not appear in printenv's output:
>
> aim-nano_03 opt # printenv | grep HOST
> aim-nano_03 opt #
>
> Is there something special about $HOSTNAME and how or when it is set?
>
> Jody
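One possible explanation, offered as an assumption and not something confirmed in this thread: bash sets HOSTNAME itself as a plain shell variable, so "echo $HOSTNAME" works, but the variable only shows up in printenv if some startup file also exports it. The sketch below shows the difference between a shell variable and an exported one, using a made-up variable named DEMO_VAR.

```sh
#!/bin/sh
# DEMO_VAR is a made-up name used only for this illustration.
unset DEMO_VAR
DEMO_VAR=aim-nano_03          # shell variable only, not exported

echo "$DEMO_VAR"              # the shell itself can still expand it

# printenv runs as a child process, so it sees only exported variables.
if printenv DEMO_VAR >/dev/null; then before=visible; else before=hidden; fi

export DEMO_VAR               # now it is part of the environment
if printenv DEMO_VAR >/dev/null; then after=visible; else after=hidden; fi

echo "before export: $before, after export: $after"
```

If this is what is happening on those machines, adding "export HOSTNAME" to a shell startup file would be the thing to try.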
Re: [OMPI users] Problem compiling open mpi 1.3 with sunstudio12 express
Certainly. I hope I haven't forgotten anything.

Olivier Marsden

Jeff Squyres wrote:
> Can you send the information listed here:
>
> http://www.open-mpi.org/community/help/

ompi-output.tar.bz2
Description: application/bzip
Re: [OMPI users] Problem compiling open mpi 1.3 with sunstudio12 express
Thanks, that was helpful.

From everything I can see, it looks like the ".../f90" value was propagated properly throughout OMPI's code base. Does f90 invoke gfortran on the back-end? Try invoking "/opt/sun/express/sunstudioceres/bin/f90" with no arguments (just like you invoked mpif90 with no arguments) and see if you get the same error.

You can also invoke "mpif90 --showme" to see the command that mpif90 would have executed.

On Jan 20, 2009, at 11:15 AM, Olivier Marsden wrote:

> Certainly. I hope I haven't forgotten anything.
>
> Olivier Marsden

--
Jeff Squyres
Cisco Systems
Re: [OMPI users] Problem compiling open mpi 1.3 with sunstudio12 express
f90 works correctly when run simply as f90 or as /opt/sun/etc.../f90, and the binaries run properly (sun f90 appears to give excellent performance, incidentally!).

The command /opt/mpi_sun/bin/mpif90 --show-me returns:

/home/marsden/sources/gcc_final/bin/gfortran -I/opt/mpi_gfortran4.4//include -pthread -I/opt/mpi_gfortran4.4//lib -L/opt/mpi_gfortran4.4//lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

For what it's worth, and as you've probably guessed, I do have another installation of openmpi. In fact two: one with the system gcc/gfortran 4.2, and one with a locally compiled gcc/gfortran 4.4. These both work correctly. The second installation seems to be interfering with my current attempt, even though I exported all the environment variables I can think of to point to the sun compilers and libraries first, before configure and compile.

Jeff Squyres wrote:
> Thanks, that was helpful.
>
> From everything I can see, it looks like the ".../f90" value was propagated properly throughout OMPI's code base. Does f90 invoke gfortran on the back-end? Try invoking "/opt/sun/express/sunstudioceres/bin/f90" with no arguments (just like you invoked mpif90 with no arguments) and see if you get the same error.
>
> You can also invoke "mpif90 --showme" to see the command that mpif90 would have executed.
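To make the mismatch described in this message easy to see, here is a small sketch that prints the compiler recorded in the wrapper data file next to the compiler the wrapper actually runs. The paths are the ones quoted in this thread; the helper names wrapper_compiler and first_word are invented for the example, and the "--showme" spelling is the one Jeff suggested.

```sh
#!/bin/sh
# Print the value of the "compiler=" key from a wrapper data file.
# wrapper_compiler is a hypothetical helper name.
wrapper_compiler() {
    sed -n 's/^compiler=//p' "$1"
}

# Print the first word of a command line, e.g. of "mpif90 --showme" output.
first_word() {
    set -- $1
    printf '%s\n' "$1"
}

data=/opt/mpi_sun/share/openmpi/mpif90-wrapper-data.txt
wrapper=/opt/mpi_sun/bin/mpif90

# Only run the comparison where this installation actually exists.
if [ -r "$data" ] && [ -x "$wrapper" ]; then
    echo "data file says: $(wrapper_compiler "$data")"
    echo "wrapper runs:   $(first_word "$("$wrapper" --showme)")"
fi
```

If the two lines differ, that would suggest the wrapper is reading its settings from somewhere other than the data file under /opt/mpi_sun.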
Re: [OMPI users] Problem compiling open mpi 1.3 with sunstudio12 express
On Jan 20, 2009, at 5:04 PM, Olivier Marsden wrote:

> f90 works correctly, when run simply as f90 or as /opt/sun/etc.../f90, and binaries run properly (sun f90 appears to give excellent performance, incidentally!)
> the command /opt/mpi_sun/bin/mpif90 --show-me returns:
>
> /home/marsden/sources/gcc_final/bin/gfortran -I/opt/mpi_gfortran4.4//include -pthread -I/opt/mpi_gfortran4.4//lib -L/opt/mpi_gfortran4.4//lib -lmpi_f90 -lmpi_f77 -lmpi -lopen-rte -lopen-pal -ldl -Wl,--export-dynamic -lnsl -lutil -lm -ldl

Interesting.

> For what it's worth, and as you've probably guessed, I do have another installation of openmpi. In fact two, one with the system gcc/gfortran4.2, and one with a locally compiled gcc/gfortran4.4. These both work correctly. The second installation seems to be interfering with my current attempt, even though I exported all environment variables I can think of to point to sun compilers & libraries first, before configure & compile.

I have oodles of installations of OMPI on my machines; they don't interfere with each other. So let's see if we can figure out why yours don't seem to play well together:

- Check that /opt/mpi_sun and /opt/mpi_gfortran* are actually distinct subdirectories, with no hidden sym/hard links in there somewhere (where directories and/or individual files might accidentally be pointing to the other tree).
- Does "env | grep mpi_" show anything interesting / revealing? What is your LD_LIBRARY_PATH set to?
- What does ldd on the various .so files in /opt/mpi_sun/lib/ show? Are they linked against files in their own tree, or the other tree?
- Run "mpif90 --showme" through strace; does it show anything illuminating?

--
Jeff Squyres
Cisco Systems
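The checks listed above could be scripted roughly as follows. This is only a sketch: the paths /opt/mpi_sun and /opt/mpi_gfortran* come from this thread, the helper names resolve and same_tree are invented, and it assumes ldd and (optionally) strace are installed.

```sh
#!/bin/sh
# Print the physical path of a directory with symlinks resolved;
# prints nothing if the directory does not exist.
resolve() {
    ( cd "$1" 2>/dev/null && pwd -P ) || true
}

# Succeed when two directories resolve to the same physical location.
same_tree() {
    a=$(resolve "$1")
    b=$(resolve "$2")
    [ -n "$a" ] && [ "$a" = "$b" ]
}

sun=/opt/mpi_sun

# 1. Are the install trees really distinct directories?
for other in /opt/mpi_gfortran*; do
    [ -d "$other" ] || continue
    if same_tree "$sun" "$other"; then
        echo "$sun and $other are the same directory"
    fi
done

# 2. Environment entries that mention either install.
env | grep mpi_ || echo "no mpi_ entries in the environment"
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-}"

# 3. Which tree is each shared library linked against?
for lib in "$sun"/lib/*.so; do
    if [ -f "$lib" ]; then
        echo "$lib:"
        ldd "$lib" | grep -i mpi || true
    fi
done

# 4. Which files does the wrapper open? (skipped if strace is missing)
if command -v strace >/dev/null 2>&1 && [ -x "$sun/bin/mpif90" ]; then
    strace -f -e trace=file "$sun/bin/mpif90" --showme 2>&1 | grep -i mpi || true
fi
```

Step 1 only compares the top-level directories; links on individual files inside the trees would still need a manual look.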
Re: [OMPI users] error opal/libltdl
Can you send all the information listed here:

http://www.open-mpi.org/community/help/

On Jan 20, 2009, at 8:08 AM, nilesh barange wrote:

> Hi...
>
> I am trying to install openmpi-1.2.8 on RHEL-4. I am getting the following error.
>
> [clususer@vlsiserver openmpi-1.2.8]$ make all install
> Making all in config
> make[1]: Entering directory `/home/clususer/openmpi-1.2.8/config'
> make[1]: Nothing to be done for `all'.
> make[1]: Leaving directory `/home/clususer/openmpi-1.2.8/config'
> Making all in contrib
> make[1]: Entering directory `/home/clususer/openmpi-1.2.8/contrib'
> make[1]: Nothing to be done for `all'.
> make[1]: Leaving directory `/home/clususer/openmpi-1.2.8/contrib'
> Making all in opal
> make[1]: Entering directory `/home/clususer/openmpi-1.2.8/opal'
> Making all in include
> make[2]: Entering directory `/home/clususer/openmpi-1.2.8/opal/include'
> make all-am
> make[3]: Entering directory `/home/clususer/openmpi-1.2.8/opal/include'
> make[3]: Leaving directory `/home/clususer/openmpi-1.2.8/opal/include'
> make[2]: Leaving directory `/home/clususer/openmpi-1.2.8/opal/include'
> Making all in libltdl
> make[2]: Entering directory `/home/clususer/openmpi-1.2.8/opal/libltdl'
> make all-am
> make[3]: Entering directory `/home/clususer/openmpi-1.2.8/opal/libltdl'
> /bin/sh ./libtool --tag=CC --mode=link gcc -O3 -DNDEBUG -module -avoid-version -o dlopen.la dlopen.lo -ldl -ldl -lnsl -lutil -lm
> libtool: link: false cru .libs/dlopen.a .libs/dlopen.o
> make[3]: *** [dlopen.la] Error 1
> make[3]: Leaving directory `/home/clususer/openmpi-1.2.8/opal/libltdl'
> make[2]: *** [all] Error 2
> make[2]: Leaving directory `/home/clususer/openmpi-1.2.8/opal/libltdl'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/home/clususer/openmpi-1.2.8/opal'
> make: *** [all-recursive] Error 1
> [clususer@vlsiserver openmpi-1.2.8]$

--
Jeff Squyres
Cisco Systems
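One reading of the quoted log, stated as an assumption and not a confirmed diagnosis: libtool normally runs "ar cru" at this step, so "false cru" suggests that configure did not find ar and substituted "false" for it. A quick check, using only standard shell facilities, is whether ar and ranlib are on PATH; the helper name have_tool is made up for this sketch.

```sh
#!/bin/sh
# Succeed when the named program can be found on PATH.
# have_tool is a hypothetical helper name.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

for tool in ar ranlib; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```

If ar turns out to be missing, installing the binutils package and rerunning configure before make would be the thing to try. The information requested above is still what would settle the question.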