Re: [OMPI users] MPI_Allreduce fail (minGW gfortran + OpenMPI 1.6.1)
I am unable to replicate your problem, but admittedly I only have access to gfortran on Linux. And I am definitely *not* a Fortran expert. :-\

The code seems to run fine for me -- can you send another test program that actually tests the results of the all reduce? Fortran allocatable stuff always confuses me; I wonder if perhaps we're not getting the passed pointer properly. Checking the results of the all reduce would be a good way to test this theory.

On Sep 6, 2012, at 12:05 PM, Yonghui wrote:

> Dear mpi users and developers,
>
> I am having some trouble with MPI_Allreduce. I am using MinGW (gcc 4.6.2)
> with OpenMPI 1.6.1. The MPI_Allreduce in the C version works fine, but the
> Fortran version fails with an error. Here is a simple Fortran program that
> reproduces the error:
>
> program main
>   implicit none
>   include 'mpif.h'
>   character * (MPI_MAX_PROCESSOR_NAME) processor_name
>   integer myid, numprocs, namelen, rc, ierr
>   integer, allocatable :: mat1(:, :, :)
>
>   call MPI_INIT( ierr )
>   call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
>   call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
>   allocate(mat1(-36:36, -36:36, -36:36))
>   mat1(:,:,:) = 111
>   print *, "Going to call MPI_Allreduce."
>   call MPI_Allreduce(MPI_IN_PLACE, mat1(-36, -36, -36), 389017, &
>                      MPI_INTEGER, MPI_BOR, MPI_COMM_WORLD, ierr)
>   print *, "MPI_Allreduce done!!!"
>   call MPI_FINALIZE(rc)
> endprogram
>
> The command that I used to compile:
>
> gfortran Allreduce.f90 -IC:\OpenMPI-win32\include C:\OpenMPI-win32\lib\libmpi_f77.lib
>
> The MPI_Allreduce fails with:
>
> [xxx:02112] [[17193,0],0]-[[17193,1],0] mca_oob_tcp_msg_recv: readv failed: Unknown error (108)
>
> I am not sure why this happens, but I think it is a problem with the Windows
> build of Open MPI, since the same simple code works on a Linux system with
> gfortran.
>
> Any ideas? I appreciate any response!
> Yonghui
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
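A result-checking test of the kind Jeff asks for might look like the sketch below (hypothetical, not from the thread; it needs an MPI installation and mpirun to build and run). Since every rank starts with 111 everywhere, a bitwise-OR reduction must leave every element equal to 111, so any corruption of the in-place buffer is immediately visible:

```fortran
program check_allreduce
  implicit none
  include 'mpif.h'
  integer myid, numprocs, ierr
  integer, allocatable :: mat1(:, :, :)

  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, numprocs, ierr)

  allocate(mat1(-36:36, -36:36, -36:36))
  mat1 = 111

  ! In-place bitwise OR: 111 OR 111 OR ... = 111, so the contents
  ! must be unchanged on every rank after the reduction.
  call MPI_ALLREDUCE(MPI_IN_PLACE, mat1, size(mat1), MPI_INTEGER, &
                     MPI_BOR, MPI_COMM_WORLD, ierr)

  if (all(mat1 == 111)) then
     print *, 'rank ', myid, ': allreduce result OK'
  else
     print *, 'rank ', myid, ': allreduce result WRONG'
  end if

  deallocate(mat1)
  call MPI_FINALIZE(ierr)
end program check_allreduce
```

If the MPI_IN_PLACE constant were being passed through the Fortran wrapper incorrectly, the "WRONG" branch (or a crash) would distinguish that from a clean run.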
[OMPI users] Setting RPATH for Open MPI libraries
Is there a way to configure Open MPI to use RPATH without needing to manually specify --with-wrapper-ldflags=-Wl,-rpath,${prefix}/lib (and similar for non-GNU-compatible compilers)?
Re: [OMPI users] Setting RPATH for Open MPI libraries
Hi,

On 08.09.2012 at 14:46, Jed Brown wrote:

> Is there a way to configure Open MPI to use RPATH without needing to manually
> specify --with-wrapper-ldflags=-Wl,-rpath,${prefix}/lib (and similar for
> non-GNU-compatible compilers)?

What do you want to achieve in detail - just shorten the `./configure` command line? You could also add it after Open MPI's compilation in the text file:

${prefix}/share/openmpi/mpicc-wrapper-data.txt

-- Reuti
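For concreteness, the relevant part of that wrapper data file looks roughly like this (an illustrative excerpt - the exact keys and defaults vary by Open MPI version and build, so check your own file before editing):

```
# Excerpt of ${prefix}/share/openmpi/mpicc-wrapper-data.txt (illustrative).
# Appending an rpath option to the linker_flags line makes mpicc bake the
# library path into every binary it links, e.g.:
linker_flags=-Wl,-rpath,/usr/local/openmpi/lib
libs=-lmpi -lopen-rte -lopen-pal
```

Each wrapper compiler (mpicc, mpif77, mpicxx, ...) has its own *-wrapper-data.txt, so the change has to be repeated per wrapper; the --with-wrapper-ldflags configure option does the same thing for all of them at build time.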
Re: [OMPI users] gcc problem compiling openmpi-1.6.1 on Solaris 10 sparc
Siegmar --

Can you test the 1.6.2rc tarball and see if the problem is resolved?

http://www.open-mpi.org/software/ompi/v1.6/

On Aug 27, 2012, at 7:04 AM, Siegmar Gross wrote:

> Hi,
>
> we have installed the latest patches on our Solaris machines and I have
> a problem compiling openmpi-1.6.1 with gcc-4.6.2. I used the following
> commands.
>
> mkdir openmpi-1.6.1-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
> cd openmpi-1.6.1-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
>
> ../openmpi-1.6.1/configure --prefix=/usr/local/openmpi-1.6.1_64_gcc \
>   --libdir=/usr/local/openmpi-1.6.1_64_gcc/lib64 \
>   LDFLAGS="-m64 -L/usr/local/gcc-4.6.2/lib/sparcv9" \
>   CC="gcc" CXX="g++" F77="gfortran" FC="gfortran" \
>   CFLAGS="-m64" CXXFLAGS="-m64" FFLAGS="-m64" FCFLAGS="-m64" \
>   CPP="cpp" CXXCPP="cpp" \
>   CPPFLAGS="" CXXCPPFLAGS="" \
>   C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
>   OBJC_INCLUDE_PATH="" MPIHOME="" \
>   --without-udapl --without-openib \
>   --enable-mpi-f90 --with-mpi-f90-size=small \
>   --enable-heterogeneous --enable-cxx-exceptions \
>   --enable-orterun-prefix-by-default \
>   --with-threads=posix --enable-mpi-thread-multiple \
>   --enable-opal-multi-threads \
>   --with-hwloc=internal --with-ft=LAM --enable-sparse-groups \
>   |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
>
> make |& tee log.make.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
>
> "log.configure.SunOS.sparc.64_gcc" shows no errors.
> "log.make.SunOS.sparc.64_gcc" breaks with the following error.
>
> ...
> Creating mpi/man/man3/OpenMPI.3 man page...
> make[2]: Leaving directory `/.../openmpi-1.6.1-SunOS.sparc.64_gcc/ompi'
> Making all in mpi/cxx
> make[2]: Entering directory `/.../openmpi-1.6.1-SunOS.sparc.64_gcc/ompi/mpi/cxx'
>   CXX    mpicxx.lo
> In file included from /usr/include/stdio.h:66:0,
>                  from ../../../../openmpi-1.6.1/ompi/mpi/cxx/mpicxx.h:50,
>                  from ../../../../openmpi-1.6.1/ompi/mpi/cxx/mpicxx.cc:22:
> /usr/include/iso/stdio_iso.h:195:60: error: redefinition of 'const char* restrict'
> /usr/include/iso/stdio_iso.h:195:32: error: 'const char* restrict' previously declared here
> /usr/include/iso/stdio_iso.h:197:16: error: redefinition of 'const char* restrict'
> /usr/include/iso/stdio_iso.h:196:34: error: 'const char* restrict' previously declared here
> /usr/include/iso/stdio_iso.h:197:38: error: conflicting declaration 'FILE* restrict'
> /usr/include/iso/stdio_iso.h:196:34: error: 'restrict' has a previous declaration as 'const char* restrict'
> /usr/include/iso/stdio_iso.h:198:48: error: conflicting declaration 'char* restrict'
> /usr/include/iso/stdio_iso.h:198:26: error: 'restrict' has a previous declaration as 'FILE* restrict'
> ...
>
> Many lines of similar errors.
>
> ...
> In file included from ../../../../openmpi-1.6.1/ompi/mpi/cxx/functions_inln.h:22:0,
>                  from ../../../../openmpi-1.6.1/ompi/mpi/cxx/mpicxx.h:272,
>                  from ../../../../openmpi-1.6.1/ompi/mpi/cxx/mpicxx.cc:22:
> /usr/include/string.h:65:57: error: conflicting declaration 'const char* restrict'
> /usr/include/string.h:65:29: error: 'restrict' has a previous declaration as 'char* restrict'
> /usr/include/string.h:66:9: error: conflicting declaration 'char** restrict'
> /usr/include/string.h:65:29: error: 'restrict' has a previous declaration as 'char* restrict'
> /usr/include/string.h:71:56: error: conflicting declaration 'const void* restrict'
> /usr/include/string.h:71:28: error: 'restrict' has a previous declaration as 'void* restrict'
> /usr/include/string.h:77:53: error: conflicting declaration 'void* restrict'
> /usr/include/string.h:77:31: error: 'restrict' has a previous declaration as 'const void* restrict'
> /usr/include/string.h:78:56: error: conflicting declaration 'void* restrict'
> /usr/include/string.h:78:34: error: 'restrict' has a previous declaration as 'const void* restrict'
> make[2]: *** [mpicxx.lo] Error 1
> make[2]: Leaving directory `.../openmpi-1.6.1-SunOS.sparc.64_gcc/ompi/mpi/cxx'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `.../openmpi-1.6.1-SunOS.sparc.64_gcc/ompi'
> make: *** [all-recursive] Error 1
> tyr openmpi-1.6.1-SunOS.sparc.64_gcc 127
>
> Has anyone else seen a similar problem? Is our system responsible for the
> problem, so that I must open a service request, or is something wrong
> with openmpi? Thank you very much for any help in advance.
>
> Kind regards
>
> Siegmar

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] OpenMPI 1.6.1 with Intel Cluster Studio 2012
Did this ever get a followup? If not...

We've seen problems with specific versions of the Intel compiler that randomly seg fault in strange places. The only solution was to upgrade the Intel compiler to the latest version in that series.

On Aug 29, 2012, at 11:24 AM, Paul Edmon wrote:

> I'm trying to build OpenMPI 1.6.1 with Intel Cluster Studio 2012 for use on
> our cluster; however, I'm running into a bit of a problem. OpenMPI itself
> builds fine, as does the code I'm testing with. However, when I try to run on
> more than about 16 processors, the run seg faults, and when I turn on
> traceback it seems to point to this call:
>
> CALL MPI_INIT_THREAD(MPI_THREAD_FUNNELED, provided, ierr)
>
> This is a pretty standard call. Even stranger, when I use a different version
> of the Intel compiler (12.0.3 20110309) it doesn't have any problems.
>
> The version of the Intel compiler in Intel Cluster Studio 2012 is 12.1.0 20110811.
>
> Does anyone have any thoughts about this error? I've tried turning off all
> optimization, to no avail. I even tried turning on debug mode for OpenMPI,
> but I can't get anything more specific about why it is failing. I also tried
> compiling the Intel MPI Benchmark, which failed in a similar way, which
> indicates that it's a problem specifically with the interaction between MPI
> and the Intel compiler, and not with the code I was working with.
>
> Thanks.
>
> -Paul Edmon-

--
Jeff Squyres
jsquy...@cisco.com
Re: [OMPI users] OpenMPI 1.6.1 with Intel Cluster Studio 2012
Interesting. I figured that might be the case. I will have to contact Intel and find out if we can get a newer version.

Thanks.

-Paul Edmon-

On 9/8/2012 3:18 PM, Jeff Squyres wrote:

> Did this ever get a followup? If not... We've seen problems with specific
> versions of the Intel compiler that randomly seg fault in strange places.
> The only solution was to upgrade the Intel compiler to the latest version
> in that series.
Re: [OMPI users] MPI_Allreduce fail (minGW gfortran + OpenMPI 1.6.1)
Hi Jeff,

Thanks for your response. As you said, the code works for you on a Linux system, and I am sure it works on Linux and even Mac OS X. But if you compile it with MinGW (basically all the GNU tools on Windows), the program aborts when it reaches MPI_Allreduce.

In my opinion, Fortran does not access the memory address directly. In C you pass a memory address as the receive buffer, but in Fortran you just pass a special value (the constant MPI_IN_PLACE in my example) to the subroutine; the wrapper passes the correct address to the C function when it sees that value.

PS: the Fortran ALLOCATE statement dynamically reserves enough room for a matrix, so you get a matrix rather than a pointer. In general, you don't need to take care of RAM addresses in Fortran: if you know the name and the indices of a matrix, you have everything. Fortran 90 did introduce the concept of a "pointer", but to me it is more like a reference in C, intended mainly for building data structures.

You can find MinGW here: http://sourceforge.net/projects/mingw/files/ It can be used by just extracting it. If you compile my little code with MinGW gfortran, you'll see the program abort. I have no idea how to debug it further. It is probably a Windows-related error, since MinGW has nothing to do with POSIX. That's all I can tell so far. Any suggestions?

Yonghui
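If the Fortran wrapper's handling of MPI_IN_PLACE is indeed what the Windows build gets wrong, one way to test that theory (untested speculation, not a confirmed fix) is to avoid MPI_IN_PLACE entirely and reduce into a separate buffer. The fragment below would replace the MPI_Allreduce call in the original program, with mat1 and ierr declared as before:

```fortran
! Hypothetical workaround sketch: skip MPI_IN_PLACE by reducing into a
! temporary buffer with the same shape, then copying the result back.
integer, allocatable :: tmp(:, :, :)

allocate(tmp(-36:36, -36:36, -36:36))
call MPI_ALLREDUCE(mat1, tmp, size(mat1), MPI_INTEGER, &
                   MPI_BOR, MPI_COMM_WORLD, ierr)
mat1 = tmp
deallocate(tmp)
```

If this version runs cleanly under MinGW while the MPI_IN_PLACE version aborts, that would point squarely at the in-place constant handling in the Windows libmpi_f77 rather than at the reduction itself.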