MPI_Intercomm_merge is broken in OpenMPI 1.1a4r9788 (and likely all
versions)
Details: the second argument, high, of MPI_Intercomm_merge is a
logical in Fortran (pg 216 of Using MPI) and an int in C. This is now
correct with regard to the f90 interfaces in OpenMPI 1.1. The
meaning of "high" is as follows (from pg 313 MPI-The Complete
Reference):
If processes in one group provided the value high = false and
processes in the other group provided the value high = true then the
union orders the "low" group before the "high" group.
In other words, if I have the following:
MPI process "parent" calls MPI_Intercomm_merge with high = .false.
(high = 0 in C)
MPI process "child" calls MPI_Intercomm_merge with high = .true.
(high = 1 in C)
then in the merged communicator parent has rank 0 and child has
rank 1. This is not happening in my tests on OS X 10.4.6 with g95;
however, my two alternative test systems handle this case as I expect:
Debian Linux with MPICH2 1.0.3 (g95) and SGI MPI Library
(sgi-mpt-1.10.1-sgi301r1) (Intel Fortran 9.x).
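The ordering rule quoted above can be sketched in plain C without any MPI
calls. This is only a model of the expected result, not Open MPI code: the
helper name expected_merged_rank and its arguments are hypothetical, and it
returns -1 when both groups pass the same value of high, since the MPI
standard leaves the order arbitrary in that case.

```c
/* Hypothetical helper (not part of MPI): models the rank a process
 * should get in the communicator returned by MPI_Intercomm_merge.
 *
 * local_rank  - rank of the process within its own group
 * remote_size - number of processes in the other group
 * local_high  - value of "high" passed by this process's group
 * remote_high - value of "high" passed by the other group
 *
 * Returns the expected merged rank, or -1 when both groups pass the
 * same "high" value (the order of the union is then arbitrary).
 */
int expected_merged_rank(int local_rank, int remote_size,
                         int local_high, int remote_high)
{
    /* Normalize to 0/1 so any nonzero int counts as true, as in C. */
    int lh = (local_high != 0);
    int rh = (remote_high != 0);

    if (lh == rh) {
        return -1;                       /* order not defined */
    }
    if (lh) {
        return remote_size + local_rank; /* "high" group goes last */
    }
    return local_rank;                   /* "low" group goes first */
}
```

For the case described above, expected_merged_rank(0, 1, 0, 1) gives 0 for
the parent and expected_merged_rank(0, 1, 1, 0) gives 1 for the child.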
The following test code is written to use the Fortran 90 interfaces
but it can be switched to the include file and fixed format source
code (.f) and should compile with both f90 and f77 compilers. I have
not written a C test code.
Michael
mpif90 parent4.f90 -o parent4
mpif90 child4.f90 -o child4
parent startup: 0 of 1
a child starting
parent spawned child process
child 0 of 1
parent merge comm: 1 of 2
ERROR: parent rank incorrect after merge
ERROR: child rank incorrect after merge
-- parent4.f90 --
program parent4
USE MPI
implicit none
integer ierr,size,rank,child,allmpi
integer k, subprocesses
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD,size,ierr)
write(6,*) 'parent startup: ', rank, ' of ', size
subprocesses = 1
call MPI_Comm_spawn('child4', MPI_ARGV_NULL, subprocesses, &
& MPI_INFO_NULL, 0, MPI_COMM_WORLD, child, MPI_ERRCODES_IGNORE, &
& ierr )
write(6,*) 'parent spawned child process'
call MPI_Intercomm_merge( child, .false., allmpi, ierr )
call MPI_COMM_RANK(allmpi,rank,ierr)
call MPI_COMM_SIZE(allmpi,size,ierr)
write(*,'(2(A,I3))') 'parent merge comm:',rank, ' of', size
if ( rank .ne. 0 ) then
write(6,*) 'ERROR: parent rank incorrect after merge'
end if
call MPI_COMM_FREE(allmpi,ierr)
call MPI_COMM_FREE(child,ierr)
call MPI_FINALIZE(ierr)
end
-- child4.f90 --
program child4
USE MPI
implicit none
integer :: ierr,size,rank,parent,rsize,allmpi
write(*,*) 'a child starting'
call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD,size,ierr)
write(*,'(2(A,I3))') 'child',rank,' of', size
call MPI_Comm_get_parent(parent,ierr)
call MPI_Intercomm_merge( parent, .true., allmpi, ierr )
call MPI_COMM_RANK(allmpi,rank,ierr)
call MPI_COMM_SIZE(allmpi,size,ierr)
if ( rank .eq. 0 ) then
write(6,*) 'ERROR: child rank incorrect after merge'
end if
call MPI_COMM_FREE(allmpi,ierr)
call MPI_COMM_FREE(parent,ierr)
call MPI_FINALIZE(ierr)
write(*,'(2(A,I3),A)') 'child',rank,' of',size,' exiting'
end
------------------------------------
On May 2, 2006, at 11:54 PM, Jeff Squyres (jsquyres) wrote:
Ok -- let me know what you find. I just checked and the code *looks*
right to me, but that doesn't mean that there isn't some deeper
implication that I'm missing.
-----Original Message-----
From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On Behalf Of Michael Kluskens
Sent: Tuesday, May 02, 2006 6:05 PM
To: Open MPI Users
Subject: Re: [OMPI users] openmpi-1.0.2 configure problem
My test codes compile fine but I'm fairly certain the logical is
being handled incorrectly. When I merge two comm's with one having
high=.false. and the other high=.true., the latter should go into the
higher ranks and the former should contain rank 0.
I'll work it over again tomorrow and see if I can create an f77
version or use the mpi.h file and see if I can get a clear difference
and I'll compare against MPICH2 but someone else should look into
this issue.
Michael
On May 1, 2006, at 11:57 PM, Jeff Squyres (jsquyres) wrote:
I just fixed the INTERCOMM_MERGE/logical issue on the trunk and the
v1.1 branch -- can you give it a whirl there?
I ask because this issue is a bug that we fixed on the trunk (and
therefore v1.1) and didn't back-port it to v1.0. There's actually
quite a few of these F90 fixes on the trunk/v1.1 branch that we did
not back-port to v1.0 (e.g., most of the other logical fixes) mainly
because we thought you were the main consumer of the F90 MPI API (and
therefore it wasn't worth it to back port :-) ). If you need all
these fixes in v1.0, we can spend the time to do the back-port, but
would prefer not to if possible.
-----Original Message-----
From: users-boun...@open-mpi.org
[mailto:users-boun...@open-mpi.org] On Behalf Of Michael Kluskens
Sent: Monday, May 01, 2006 6:20 PM
To: Open MPI Users
Subject: [OMPI users] openmpi-1.0.2 configure problem
checking if FORTRAN compiler supports integer(selected_int_kind(2))... yes
checking size of FORTRAN integer(selected_int_kind(2))... unknown
configure: WARNING: *** Problem running configure test!
configure: WARNING: *** See config.log for details.
configure: error: *** Cannot continue.
Source code: openmpi-1.0.2 stable
OS X 10.4.5 with g95 (Apr 27 2006)
./configure F77=g95 FC=g95 LDFLAGS=-lSystemStubs
I find this rather surprising given that I have been regularly
building nightly snapshots of Open MPI 1.1 and 1.2 (the other bug is
preventing me from using them at the moment till either I change my
code or the bug gets fixed).
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users