Re: [OMPI users] MPI_Allreduce fail (minGW gfortran + OpenMPI 1.6.1)

2012-09-08 Thread Jeff Squyres
I am unable to replicate your problem, but admittedly I only have access to 
gfortran on Linux.  And I am definitely *not* a Fortran expert.  :-\

The code seems to run fine for me -- can you send another test program that 
actually tests the results of the all reduce?  Fortran allocatable stuff always 
confuses me; I wonder if perhaps we're not getting the passed pointer properly. 
 Checking the results of the all reduce would be a good way to check this 
theory.
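
For example, something along these lines (an untested sketch derived from your program; since every rank contributes 111, a bitwise OR should leave every element equal to 111) would be enough to verify the result:

program check_allreduce
implicit none
include 'mpif.h'
integer myid, numprocs, ierr, nbad
integer, allocatable :: mat1(:, :, :)

call MPI_INIT( ierr )
call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
allocate(mat1(-36:36, -36:36, -36:36))
mat1(:,:,:) = 111
call MPI_Allreduce(MPI_IN_PLACE, mat1(-36, -36, -36), 389017, &
                   MPI_INTEGER, MPI_BOR, MPI_COMM_WORLD, ierr)
! Every rank contributed 111, so a bitwise OR must leave 111 everywhere.
nbad = count(mat1 /= 111)
print *, 'rank', myid, ': elements with unexpected value =', nbad
deallocate(mat1)
call MPI_FINALIZE(ierr)
end program check_allreduce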



On Sep 6, 2012, at 12:05 PM, Yonghui wrote:

> Dear mpi users and developers,
>  
> I am having some trouble with MPI_Allreduce. I am using MinGW (gcc 4.6.2) 
> with OpenMPI 1.6.1. The C version of MPI_Allreduce works fine, but the 
> Fortran version fails with an error. Here is a simple Fortran program that 
> reproduces the error:
>  
> program main
> implicit none
> include 'mpif.h'
> character * (MPI_MAX_PROCESSOR_NAME) processor_name
> integer myid, numprocs, namelen, rc, ierr
> integer, allocatable :: mat1(:, :, :)
>  
> call MPI_INIT( ierr )
> call MPI_COMM_RANK( MPI_COMM_WORLD, myid, ierr )
> call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, ierr )
> allocate(mat1(-36:36, -36:36, -36:36))
> mat1(:,:,:) = 111
> print *, "Going to call MPI_Allreduce."
> call MPI_Allreduce(MPI_IN_PLACE, mat1(-36, -36, -36), 389017, MPI_INTEGER, MPI_BOR, MPI_COMM_WORLD, ierr)
> print *, "MPI_Allreduce done!!!"
> call MPI_FINALIZE(rc)
> end program
>  
> The command that I used to compile:
> gfortran Allreduce.f90 -IC:\OpenMPI-win32\include C:\OpenMPI-win32\lib\libmpi_f77.lib
>  
> The MPI_Allreduce fails with: [xxx:02112] [[17193,0],0]-[[17193,1],0] 
> mca_oob_tcp_msg_recv: readv failed: Unknown error (108).
> I am not sure why this happens, but I think it is a problem with the Windows 
> build of MPI, since the same code works on a Linux system with gfortran.
>  
> Any ideas? I appreciate any response!
>  
> Yonghui
>  


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] Setting RPATH for Open MPI libraries

2012-09-08 Thread Jed Brown
Is there a way to configure Open MPI to use RPATH without needing to
manually specify --with-wrapper-ldflags=-Wl,-rpath,${prefix}/lib (and
similar for non-GNU-compatible compilers)?


Re: [OMPI users] Setting RPATH for Open MPI libraries

2012-09-08 Thread Reuti
Hi,

Am 08.09.2012 um 14:46 schrieb Jed Brown:

> Is there a way to configure Open MPI to use RPATH without needing to manually 
> specify --with-wrapper-ldflags=-Wl,-rpath,${prefix}/lib (and similar for 
> non-GNU-compatible compilers)?

What do you want to achieve in detail - just shorten the `./configure` command 
line? You could also add it, after Open MPI's compilation, to the text file:

${prefix}/share/openmpi/mpicc-wrapper-data.txt 
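
For example (a sketch only; the exact keys and defaults in that file vary with the Open MPI version and compiler), the rpath flag would go on the linker_flags line, with your actual installation prefix filled in:

linker_flags=-Wl,-rpath,/path/to/openmpi/lib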

-- Reuti






Re: [OMPI users] gcc problem compiling openmpi-1.6.1 on Solaris 10 sparc

2012-09-08 Thread Jeff Squyres
Siegmar --

Can you test the 1.6.2rc tarball and see if the problem is resolved?

http://www.open-mpi.org/software/ompi/v1.6/



On Aug 27, 2012, at 7:04 AM, Siegmar Gross wrote:

> Hi,
> 
> we have installed the latest patches on our Solaris machines and I have
> a problem compiling openmpi-1.6.1 with gcc-4.6.2. I used the following
> commands.
> 
> mkdir openmpi-1.6.1-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
> cd openmpi-1.6.1-${SYSTEM_ENV}.${MACHINE_ENV}.64_gcc
> 
> ../openmpi-1.6.1/configure --prefix=/usr/local/openmpi-1.6.1_64_gcc \
>  --libdir=/usr/local/openmpi-1.6.1_64_gcc/lib64 \
>  LDFLAGS="-m64 -L/usr/local/gcc-4.6.2/lib/sparcv9" \
>  CC="gcc" CXX="g++" F77="gfortran" FC="gfortran" \
>  CFLAGS="-m64" CXXFLAGS="-m64" FFLAGS="-m64" FCFLAGS="-m64" \
>  CPP="cpp" CXXCPP="cpp" \
>  CPPFLAGS="" CXXCPPFLAGS="" \
>  C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
>  OBJC_INCLUDE_PATH="" MPIHOME="" \
>  --without-udapl --without-openib \
>  --enable-mpi-f90 --with-mpi-f90-size=small \
>  --enable-heterogeneous --enable-cxx-exceptions \
>  --enable-orterun-prefix-by-default \
>  --with-threads=posix --enable-mpi-thread-multiple \
>  --enable-opal-multi-threads \
>  --with-hwloc=internal --with-ft=LAM --enable-sparse-groups \
>  |& tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
> 
> make |& tee log.make.$SYSTEM_ENV.$MACHINE_ENV.64_gcc
> 
> "log.configure.SunOS.sparc.64_gcc" shows no errors.
> "log.make.SunOS.sparc.64_gcc" breaks with the following error.
> 
> ...
> Creating mpi/man/man3/OpenMPI.3 man page...
> make[2]: Leaving directory
>  `/.../openmpi-1.6.1-SunOS.sparc.64_gcc/ompi'
> Making all in mpi/cxx
> make[2]: Entering directory
>  `/.../openmpi-1.6.1-SunOS.sparc.64_gcc/ompi/mpi/cxx'
>  CXX    mpicxx.lo
> In file included from /usr/include/stdio.h:66:0,
>  from ../../../../openmpi-1.6.1/ompi/mpi/cxx/mpicxx.h:50,
>  from ../../../../openmpi-1.6.1/ompi/mpi/cxx/mpicxx.cc:22:
> /usr/include/iso/stdio_iso.h:195:60: error:
>  redefinition of 'const char* restrict'
> /usr/include/iso/stdio_iso.h:195:32: error:
>  'const char* restrict' previously declared here
> /usr/include/iso/stdio_iso.h:197:16: error:
>  redefinition of 'const char* restrict'
> /usr/include/iso/stdio_iso.h:196:34: error:
>  'const char* restrict' previously declared here
> /usr/include/iso/stdio_iso.h:197:38: error:
>  conflicting declaration 'FILE* restrict'
> /usr/include/iso/stdio_iso.h:196:34: error:
>  'restrict' has a previous declaration as 'const char* restrict'
> /usr/include/iso/stdio_iso.h:198:48: error:
>  conflicting declaration 'char* restrict'
> /usr/include/iso/stdio_iso.h:198:26: error:
>  'restrict' has a previous declaration as 'FILE* restrict'
> ...
> 
> Many lines of similar errors.
> 
> ...
> In file included from
>  ../../../../openmpi-1.6.1/ompi/mpi/cxx/functions_inln.h:22:0,
>  from ../../../../openmpi-1.6.1/ompi/mpi/cxx/mpicxx.h:272,
>  from ../../../../openmpi-1.6.1/ompi/mpi/cxx/mpicxx.cc:22:
> /usr/include/string.h:65:57: error:
>  conflicting declaration 'const char* restrict'
> /usr/include/string.h:65:29: error:
>  'restrict' has a previous declaration as 'char* restrict'
> /usr/include/string.h:66:9: error:
>  conflicting declaration 'char** restrict'
> /usr/include/string.h:65:29: error:
>  'restrict' has a previous declaration as 'char* restrict'
> /usr/include/string.h:71:56: error:
>  conflicting declaration 'const void* restrict'
> /usr/include/string.h:71:28: error:
>  'restrict' has a previous declaration as 'void* restrict'
> /usr/include/string.h:77:53: error:
>  conflicting declaration 'void* restrict'
> /usr/include/string.h:77:31: error:
>  'restrict' has a previous declaration as 'const void* restrict'
> /usr/include/string.h:78:56: error:
>  conflicting declaration 'void* restrict'
> /usr/include/string.h:78:34: error:
>  'restrict' has a previous declaration as 'const void* restrict'
> make[2]: *** [mpicxx.lo] Error 1
> make[2]: Leaving directory `.../openmpi-1.6.1-SunOS.sparc.64_gcc/ompi/mpi/cxx'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `.../openmpi-1.6.1-SunOS.sparc.64_gcc/ompi'
> make: *** [all-recursive] Error 1
> tyr openmpi-1.6.1-SunOS.sparc.64_gcc 127 
> 
> 
> Has anyone else seen a similar problem? Is our system responsible for the
> problem, so that I must open a service request, or is something wrong
> with Open MPI? Thank you very much in advance for any help.
> 
> 
> Kind regards
> 
> Siegmar
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] OpenMPI 1.6.1 with Intel Cluster Studio 2012

2012-09-08 Thread Jeff Squyres
Did this ever get a followup?  If not...

We've seen problems with specific versions of the Intel compiler that randomly 
seg fault in strange places.  The only solution was to upgrade the Intel 
compiler to the latest version in that series.


On Aug 29, 2012, at 11:24 AM, Paul Edmon wrote:

> I'm trying to build OpenMPI 1.6.1 with Intel Cluster Studio 2012 for use on 
> our cluster; however, I'm running into a bit of a problem.  OpenMPI itself 
> builds fine, as does the code I'm testing with. However, when I try to run on 
> more than about 16 processors the run seg faults, and when I turn on traceback 
> it seems to point to this call:
> 
> CALL MPI_INIT_THREAD(MPI_THREAD_FUNNELED, provided, ierr)
> 
> This is a pretty standard call. Even stranger, when I use a different version 
> of the Intel compiler (12.0.3 20110309) it doesn't have any problems.
> 
> The version of the Intel compiler in Intel Cluster Studio 2012 is 12.1.0 
> 20110811.
> 
> Does anyone have any thoughts about this error?  I've tried turning off all 
> optimization to no avail.  I also tried turning on debug mode for OpenMPI, but 
> I can't get anything more specific about why it is failing.  I even tried 
> compiling the Intel MPI Benchmark, which failed in a similar way, which indicates 
> that it's a problem with the interaction between MPI and the Intel compiler 
> rather than with the code I was working with.
> 
> Thanks.
> 
> -Paul Edmon-


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/




Re: [OMPI users] OpenMPI 1.6.1 with Intel Cluster Studio 2012

2012-09-08 Thread Paul Edmon
Interesting.  I figured that might be the case.  I will have to contact 
Intel and find out if we can get a newer version.


Thanks.

-Paul Edmon-

On 9/8/2012 3:18 PM, Jeff Squyres wrote:

Did this ever get a followup?  If not...

We've seen problems with specific versions of the Intel compiler that randomly 
seg fault in strange places.  The only solution was to upgrade the Intel 
compiler to the latest version in that series.








Re: [OMPI users] MPI_Allreduce fail (minGW gfortran + OpenMPI 1.6.1)

2012-09-08 Thread Yonghui
Hi Jeff,



Thanks for your response. As you said, the code works for you on a Linux
system, and I am sure the code works on Linux and even Mac OS X. But if you
compile with MinGW (basically all the GNU tools on Windows), the code aborts
when it reaches MPI_Allreduce.



My understanding is that Fortran does not access memory addresses directly. In C
you pass a memory address as the receive buffer, but in Fortran you just pass a
special constant (MPI_IN_PLACE in my example) to the subroutine, and the wrapper
passes the correct address on to the C function when it sees that constant.
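
For what it is worth, a variant of the call that avoids MPI_IN_PLACE altogether (an untested sketch, using a separate receive buffer mat2 in the same test program) might help isolate whether the special constant is the problem:

integer, allocatable :: mat2(:, :, :)
allocate(mat2(-36:36, -36:36, -36:36))
! Same reduction, but into a separate buffer instead of in place.
call MPI_Allreduce(mat1(-36, -36, -36), mat2(-36, -36, -36), 389017, &
                   MPI_INTEGER, MPI_BOR, MPI_COMM_WORLD, ierr)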



PS: the Fortran allocate statement can be used to dynamically reserve enough room
for an array, and you then work with the array itself rather than a pointer. In
general, you don't need to take care of memory addresses in Fortran: if you know
the name and the index of an array, you have everything. Fortran 90 does introduce
the concept of a "pointer", but to me it is more like a reference in C, mainly
meant for building data structures.



You can find MinGW here: http://sourceforge.net/projects/mingw/files/

It can be used by just extracting it. If you compile my little code with
MinGW gfortran, you will see the program abort. I have no idea how to check
it further. It is probably a Windows-related error, since MinGW has nothing
to do with POSIX.



That's what I can tell so far. Any suggestions?



Yonghui