Brian,

I notice the following parameters in the ompi_info output that seem relevant to this problem:

MCA btl: parameter "btl_self_free_list_num" (current value: "0")
MCA btl: parameter "btl_self_free_list_max" (current value: "-1")
MCA btl: parameter "btl_self_free_list_inc" (current value: "32")
MCA btl: parameter "btl_self_eager_limit" (current value: "131072")
MCA btl: parameter "btl_self_max_send_size" (current value: "262144")
MCA btl: parameter "btl_self_max_rdma_size" (current value: "2147483647")
MCA btl: parameter "btl_self_exclusivity" (current value: "65536")
MCA btl: parameter "btl_self_flags" (current value: "2")
MCA btl: parameter "btl_self_priority" (current value: "0")

Specifically, 'btl_self_max_send_size=262144', which I assume is the maximum size (in bytes?) of a message a process can send to itself. None of the messages in my tests above approached this limit. However, I am puzzled, because the program below runs correctly for ridiculously large message sizes (200 Mbytes, as shown): with imjm = 200,000,000 and (I assume) 4-byte reals, each mpi_get pulls imjm/nproc reals from rank 0, so even the self-directed get on the first rank moves hundreds of megabytes on a small process count, far beyond that 262144-byte limit.
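As a side check (just a sketch on my part, based on my assumption that btl_self_max_send_size caps a single message a process sends to itself over the self BTL), something like the following exercises a plain send-to-self well above 262144 bytes:

program selftest
!
implicit none
!
include 'mpif.h'
!
integer n
parameter (n=100000)
!
integer irank,istat
integer istatus(mpi_status_size)
!
real , dimension(:) , allocatable :: a
real , dimension(:) , allocatable :: b
!
call mpi_init(istat)
call mpi_comm_rank(mpi_comm_world,irank,istat)
!
allocate(a(n))
allocate(b(n))
a = 1.0
!
! one message to myself: 100000 reals, i.e. about 400000 bytes
! (assuming 4-byte reals), above the 262144-byte limit
call mpi_sendrecv(a,n,mpi_real,irank,1, &
& b,n,mpi_real,irank,1, &
& mpi_comm_world,istatus,istat)
!
print *,' self sendrecv rank ',irank,b(1),b(n)
!
call mpi_finalize(istat)
end

The full one-sided test program is below: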

program test
!
implicit none
!
include 'mpif.h'
!
integer imjm
parameter (imjm=200000000)
!
integer itype , istrt , itau ,istat , msg , nstat
integer irank,nproc,iwin,i,n,ir
integer isizereal
integer(kind=mpi_address_kind) iwinsize,itarget_disp   ! window size and target displacement must be address-kind integers
!
integer , dimension(:) , allocatable :: len
integer , dimension(:) , allocatable :: loff
real , dimension(:) , allocatable :: x
real , dimension(:) , allocatable :: ximjm
!
!
call mpi_init(istat)
call mpi_comm_rank(mpi_comm_world,irank,istat)
call mpi_comm_size(mpi_comm_world,nproc,istat)
!
allocate(len(nproc))
allocate(loff(nproc))
allocate(x(imjm/nproc))
!
ir = irank + 1
!
if(ir.eq.1)then
allocate(ximjm(imjm))
else
allocate(ximjm(1))      ! other ranks still need a valid base address for the window
endif
!
do 200 n = 1,nproc
len(n) = imjm/nproc
loff(n) = (n-1)*imjm/nproc
200 continue
!
call mpi_type_size(mpi_real,isizereal,istat)
!
iwinsize = 0
if(ir.eq.1)iwinsize = imjm*isizereal      ! only the first rank exposes the full array
call mpi_win_create(ximjm,iwinsize,isizereal,mpi_info_null, &
& mpi_comm_world,iwin,istat)
!
if(ir.eq.1)then
do 250 i = 1,imjm
ximjm(i) = i
250 continue
endif
!
itarget_disp = loff(ir)
call mpi_win_fence(0,iwin,istat)
! every rank, including the first, gets its segment from the window on rank 0
call mpi_get(x,len(ir),mpi_real,0,itarget_disp,len(ir),mpi_real, &
& iwin,istat)
call mpi_win_fence(0,iwin,istat)
!
print '(A,i3,8f20.2)',' x ',ir,x(1),x(len(ir))
!
call mpi_win_free(iwin,istat)
call mpi_finalize(istat)
end
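For reference, the pattern in the subroutine I described in my earlier message (quoted below) looks roughly like this. This is only a sketch, not the actual failing code, but it shows every rank exposing its own segment in a window and then getting every rank's segment, including its own:

program gathertest
!
! every rank exposes its own segment in a window, then gets every
! rank's segment, including its own, into a global array
!
implicit none
!
include 'mpif.h'
!
integer nloc
parameter (nloc=1000)
!
integer irank,nproc,iwin,istat,isizereal,n,ir
integer(kind=mpi_address_kind) iwinsize,idisp
!
real , dimension(:) , allocatable :: x
real , dimension(:) , allocatable :: glob
!
call mpi_init(istat)
call mpi_comm_rank(mpi_comm_world,irank,istat)
call mpi_comm_size(mpi_comm_world,nproc,istat)
ir = irank + 1
!
allocate(x(nloc))
allocate(glob(nloc*nproc))
x = real(ir)
!
call mpi_type_size(mpi_real,isizereal,istat)
iwinsize = nloc*isizereal
call mpi_win_create(x,iwinsize,isizereal,mpi_info_null, &
& mpi_comm_world,iwin,istat)
!
call mpi_win_fence(0,iwin,istat)
idisp = 0
do 100 n = 1,nproc
call mpi_get(glob((n-1)*nloc+1),nloc,mpi_real,n-1,idisp, &
& nloc,mpi_real,iwin,istat)
100 continue
call mpi_win_fence(0,iwin,istat)
!
print *,' rank ',ir,' glob ',glob(1),glob(nloc*nproc)
!
call mpi_win_free(iwin,istat)
call mpi_finalize(istat)
end

The only structural difference from the program above is that here every rank is a target, so every rank does a get from itself.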

Tom Rosmond




Brian Barrett wrote:

On Mon, 2006-09-04 at 11:01 -0700, Tom Rosmond wrote:

Attached is some error output from my tests of 1-sided message
passing, plus my info file.  Below are two copies of a simple Fortran
subroutine that mimics mpi_allgatherv using mpi_get calls.  The top
version fails, the bottom runs OK.  It seems clear from these
examples, plus the 'self_send' phrases in the error output, that there
is a problem internally with a processor sending data to itself.  I
know that your 'mpi_get' implementation is simply a wrapper around
'send/recv' calls, so clearly this shouldn't happen.  However, the
problem does not happen in all cases; I tried to duplicate it in a
simple stand-alone program with mpi_get calls and was unable to make
it fail.  Go figure.

That is an odd failure, and at first glance it does look like there is
something wrong with our one-sided implementation.  I've filed a bug in
our tracker about the issue, and you should get updates on the ticket as
we work on it.

Thanks,

Brian
