Hmm. I can now replicate this on OSX as well, but I'm not sure I
agree with all of your analysis. Here's what I get from an OMPI SVN
trunk build:
[9:34] rtp-jsquyres-8718:~/bogus/lib % foreach file (`ls *.0.dylib`)
foreach? echo ================= $file
foreach? nm $file | grep in_place
foreach? end
================= libmca_common_sm.0.dylib
================= libmpi.0.dylib
0011a638 D _MPI_FORTRAN_IN_PLACE
0011a634 D _mpi_fortran_in_place
0011a63c D _mpi_fortran_in_place_
0011a640 D _mpi_fortran_in_place__
================= libmpi_cxx.0.dylib
00008144 S __ZN3MPI8IN_PLACEE
================= libmpi_f77.0.dylib
U _MPI_FORTRAN_IN_PLACE
U _mpi_fortran_in_place
U _mpi_fortran_in_place_
U _mpi_fortran_in_place__
================= libopen-pal.0.dylib
================= libopen-rte.0.dylib
0007f2b4 D _orte_snapc_base_store_in_place
The __Z symbol is in libmpi_cxx, so I don't think it's relevant here
(that's the part that I disagree about). But notice that my
*fortran_in_place*/i symbols are "D" in libmpi (where they are
defined) and U in libmpi_f77. This is different than your output.
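
To make comparisons like this less error-prone than eyeballing, the per-library output of the loop above can be parsed into a table. The following is only a minimal sketch: the function names (`parse_nm_report`, `undefined_symbols`) are made up for illustration, and it assumes header lines start with `=` and that each `nm` line is an optional hex address, a one-letter type, and a symbol name.

```python
import re

# One line of `nm` output: optional hex address, one-letter type code,
# symbol name. Undefined ("U") symbols carry no address.
NM_LINE = re.compile(r"^\s*(?:[0-9a-fA-F]+\s+)?([A-Za-z])\s+(\S+)\s*$")

def parse_nm_report(text):
    """Group symbols by library; returns {library: {symbol: type_letter}}.

    Assumes header lines look like '===== libfoo.dylib', as echoed by
    the foreach loop.
    """
    report = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("="):
            current = stripped.lstrip("=").strip()
            report[current] = {}
            continue
        match = NM_LINE.match(line)
        if match and current is not None:
            type_letter, symbol = match.groups()
            report[current][symbol] = type_letter
    return report

def undefined_symbols(report, library):
    """Symbols that `library` references but does not define (type 'U')."""
    return sorted(s for s, t in report.get(library, {}).items() if t == "U")

sample = """\
================= libmpi.0.dylib
0011a638 D _MPI_FORTRAN_IN_PLACE
0011a63c D _mpi_fortran_in_place_
================= libmpi_f77.0.dylib
U _MPI_FORTRAN_IN_PLACE
U _mpi_fortran_in_place_
"""
report = parse_nm_report(sample)
print(report["libmpi.0.dylib"])
print(undefined_symbols(report, "libmpi_f77.0.dylib"))
```

Running the same parser over the output of two different builds gives two dictionaries that can be compared directly, which shows at a glance where a type letter changed or a reference went missing.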
Here's the output from a 1.3.3 build:
[9:55] rtp-jsquyres-8718:~/bogus/1.3/lib % !for
foreach file ( `ls *.0.dylib` )
foreach? echo =========== $file
foreach? nm $file | grep in_place
foreach? end
=========== libmca_common_sm.0.dylib
=========== libmpi.0.dylib
000a4d30 S _MPI_FORTRAN_IN_PLACE
000a4d34 S _mpi_fortran_in_place
000a4d38 S _mpi_fortran_in_place_
000a4d3c S _mpi_fortran_in_place__
=========== libmpi_cxx.0.dylib
00007328 S __ZN3MPI8IN_PLACEE
=========== libmpi_f77.0.dylib
U _mpi_fortran_in_place_
=========== libopen-pal.0.dylib
=========== libopen-rte.0.dylib
00036eea D _orte_snapc_base_store_in_place
Here's the output from a 1.2.9 build:
[9:35] rtp-jsquyres-8718:~/bogus/1.2/lib % foreach file ( `ls *.0.dylib` )
foreach? echo ============= $file
foreach? nm $file | grep in_place
foreach? end
============= libmca_common_sm.0.dylib
============= libmpi.0.dylib
00093950 S _MPI_FORTRAN_IN_PLACE
00093954 S _mpi_fortran_in_place
00093958 S _mpi_fortran_in_place_
0009395c S _mpi_fortran_in_place__
============= libmpi_cxx.0.dylib
0000e00c D __ZN3MPI8IN_PLACEE
============= libmpi_f77.0.dylib
U _mpi_fortran_in_place_
============= libopen-pal.0.dylib
============= libopen-rte.0.dylib
Notes:
1. I can't see how libmpi_cxx affects anything in the f90 app.
2. The trunk builds have the symbols as D's, but the 1.2 and 1.3
builds have them as S's.
3. Build and run with 1.2 works, build and run with 1.3 fails.
Inserting output statements in the runs, I can see that 1.2 correctly
detects MPI_IN_PLACE but 1.3 and trunk do not.
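
For context on what "detects MPI_IN_PLACE" means: as far as I understand it, Fortran sentinel constants like this are recognized by the address of the argument, not by its value. The sketch below only illustrates that general mechanism using ctypes; it is not Open MPI code, `LIBRARY_IN_PLACE` and `is_in_place` are hypothetical names, and it says nothing about what actually causes the failure here.

```python
import ctypes

# Hypothetical stand-in for the library's own copy of the sentinel
# variable (the role played by mpi_fortran_in_place_ in libmpi).
LIBRARY_IN_PLACE = ctypes.c_int(0)

def is_in_place(buffer):
    """Return True only if `buffer` lives at the sentinel's address."""
    return ctypes.addressof(buffer) == ctypes.addressof(LIBRARY_IN_PLACE)

# Case 1: the caller's symbol resolved to the library's sentinel.
resolved = LIBRARY_IN_PLACE

# Case 2: the caller ended up with its own storage for the symbol.
# Same contents, different address, so detection fails.
private_copy = ctypes.c_int(0)

print(is_in_place(resolved))      # True
print(is_in_place(private_copy))  # False
```

The second case is the failure mode to look for: if the application and the library do not end up sharing a single address for the symbol, an address comparison can never succeed, whatever the variable contains.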
So it's something more than S vs. D, and I don't believe that the
libmpi_cxx symbol is involved. This is definitely a bug. Doh! With
a *brief* code examination, I don't see any substantive code changes
between 1.2.x and the SVN trunk/v1.3, but we definitely did change
versions of Libtool. I wonder if this is involved somehow.
I don't have the cycles at the moment to investigate, but I've filed a
blocker ticket against OMPI 1.3.4:
https://svn.open-mpi.org/trac/ompi/ticket/1982
I made it a blocker because I assume this also affects all the other
"special" constants in Fortran, like MPI_BOTTOM.
On Aug 4, 2009, at 5:38 AM, Ricardo Fonseca wrote:
Hi Jeff
This is a Mac OS X (10.5.7) specific issue that occurs in every
version after 1.2.9 that I've tested (1.3.0 through the 1.4 nightly),
regardless of which Fortran compiler you use (ifort / g95 /
gfortran). I've been able to replicate this issue on other OS X
machines, and I am sure that I am using the correct headers /
libraries. Version 1.2.9 is working correctly. Here are some system
details:
$ uname -a
Darwin zamblap.epp.ist.utl.pt 9.7.0 Darwin Kernel Version 9.7.0: Tue Mar 31 22:52:17 PDT 2009; root:xnu-1228.12.14~1/RELEASE_I386 i386
$ gcc --version
i686-apple-darwin9-gcc-4.0.1 (GCC) 4.0.1 (Apple Inc. build 5493)
$ ld -v
@(#)PROGRAM:ld PROJECT:ld64-85.2.1
This might be a (again, Mac OS X specific) libtool issue. If you
look at the name list of the generated .dylib libraries for 1.3.3
you get:
$ nm /opt/openmpi/1.3.3-g95-32/lib/*.dylib | grep -i in_place
000a4d30 S _MPI_FORTRAN_IN_PLACE
000a4d34 S _mpi_fortran_in_place
000a4d38 S _mpi_fortran_in_place_
000a4d3c S _mpi_fortran_in_place__
000a4d30 S _MPI_FORTRAN_IN_PLACE
000a4d34 S _mpi_fortran_in_place
000a4d38 S _mpi_fortran_in_place_
000a4d3c S _mpi_fortran_in_place__
00007328 S __ZN3MPI8IN_PLACEE
00007328 S __ZN3MPI8IN_PLACEE
U _mpi_fortran_in_place__
U _mpi_fortran_in_place__
00036eea D _orte_snapc_base_store_in_place
00036eea D _orte_snapc_base_store_in_place
But for 1.2.9 you get:
$ nm /opt/openmpi/1.2.9-g95-32/lib/*.dylib | grep -i in_place
00093950 S _MPI_FORTRAN_IN_PLACE
00093954 S _mpi_fortran_in_place
00093958 S _mpi_fortran_in_place_
0009395c S _mpi_fortran_in_place__
00093950 S _MPI_FORTRAN_IN_PLACE
00093954 S _mpi_fortran_in_place
00093958 S _mpi_fortran_in_place_
0009395c S _mpi_fortran_in_place__
0000e00c D __ZN3MPI8IN_PLACEE
0000e00c D __ZN3MPI8IN_PLACEE
U _mpi_fortran_in_place__
U _mpi_fortran_in_place__
So the __ZN3MPI8IN_PLACEE symbol, which I guess refers to the Fortran
MPI_IN_PLACE constant, is being defined incorrectly in the 1.3.3
version as an S (symbol in a section other than those above), while
it should be defined as a D (data section symbol) as part of an
"external" common block, as happens in 1.2.9. So when linking against
the 1.3.3 version, the MPI_IN_PLACE constant will never have the same
address as any of the mpi_fortran_in_place variables, but rather its
own address.
Thanks again for your help,
Ricardo
---
Prof. Ricardo Fonseca
GoLP - Grupo de Lasers e Plasmas
Instituto de Plasmas e Fusão Nuclear
Instituto Superior Técnico
Av. Rovisco Pais
1049-001 Lisboa
Portugal
tel: +351 21 8419202
fax: +351 21 8464455
web: http://cfp.ist.utl.pt/golp/
On Aug 1, 2009, at 17:00 , users-requ...@open-mpi.org wrote:
Message: 2
Date: Sat, 1 Aug 2009 07:44:47 -0400
From: Jeff Squyres <jsquy...@cisco.com>
Subject: Re: [OMPI users] OMPI users] MPI_IN_PLACE in Fortran
 with MPI_REDUCE / MPI_ALLREDUCE
To: Open MPI Users <us...@open-mpi.org>
Message-ID: <ca25ccf4-c5e7-47c0-a24e-8b05b59a6...@cisco.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Hmm. FWIW, I'm unable to replicate your error. I tried with the OMPI
SVN trunk and a build of the OMPI 1.3.3 tarball using the GNU compiler
suite on RHEL4U5.
I've even compiled your sample code with "mpif90" using the "use mpi"
statement -- I did not get an unclassifiable statement. What version
of Open MPI are you using? Please send the info listed here:
http://www.open-mpi.org/community/help/
Can you confirm that you're not accidentally mixing and matching
multiple versions of Open MPI?
--
Jeff Squyres
jsquy...@cisco.com