Jeff Squyres wrote:
On Dec 3, 2009, at 10:56 AM, Brock Palen wrote:
  
The allocation statement is ok:
allocate(vec(vec_size,vec_per_proc*(size-1)))

This allocates memory vec(32768, 2350)
    
So this allocates 32768 rows, each with 2350 columns -- all stored contiguously in memory, in column-major order.  Does the language/compiler *guarantee* that the entire matrix is contiguous in memory?  Or does it only guarantee that the *columns* are contiguous in memory -- and there may be gaps between successive columns?
  
I think you're getting one big contiguous block of memory and the portions that are passed are contiguous, nonoverlapping pieces.
This means that in the first iteration, you're calling:
call MPI_RECV(vec(1, 2301), 32768, ...)

And in the last iteration, you're calling:
call MPI_RECV(vec(1, 2350), 32768, ...)

That doesn't seem right.  If I'm reading this right -- and I very well may not be -- it looks like successive receives will be partially overlaying the previous receive.
No.  In Fortran, the leftmost index varies the fastest, so each column of vec is one contiguous run of 32768 elements and a receive that starts at vec(1,j) fills exactly column j; successive receives land in disjoint columns.  E.g.,

% cat y.f90
  integer a(2,2)
  a(1,1) = 11
  a(2,1) = 21
  a(1,2) = 12
  a(2,2) = 22
  call sub(a)
end

subroutine sub(a)
  integer a(4)
  write(6,*) a
end
% a.out
 11 21 12 22
%
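To tie that back to the receives above, here is a small sketch (not from the original post; the array sizes and the fill routine are made up for illustration) that passes successive column starts of a 2-D array to a routine expecting a 1-D buffer, the same way vec(1,j) is handed to MPI_RECV.  Each call fills its own column and nothing else:

  integer, parameter :: nrow = 4, ncol = 3
  integer a(nrow, ncol)
  a = 0
  do j = 1, ncol
     ! hand over the address of column j, as MPI_RECV is handed vec(1,j)
     call fill(a(1,j), nrow, 10*j)
  end do
  write(6,*) a        ! prints 11 12 13 14 21 22 23 24 31 32 33 34
end

subroutine fill(col, n, val)
  integer n, col(n), val
  do i = 1, n
     col(i) = val + i
  end do
end

Column j starts (j-1)*nrow elements into the underlying storage, so the 50 receives into vec(1,2301) through vec(1,2350) neither overlap nor leave gaps.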

Here is how I think of Brock's code:

program sendbuf

  include 'mpif.h'

  integer, parameter :: n = 32 * 1024, m = 50

  complex*16 buf(n)

  call MPI_INIT(ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD, np, ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD, me, ierr)

  buf = 0

  if ( me == 0 ) then
     do i = 1, np-1
        do j = 1, m
           call MPI_RECV(buf, n, MPI_DOUBLE_COMPLEX, i, j, MPI_COMM_WORLD, MPI_STATUS_IGNORE, ierr)
        end do
     end do
  else
     do j = 1, m
        call MPI_SEND(buf, n, MPI_DOUBLE_COMPLEX, 0, j, MPI_COMM_WORLD, ierr)
     end do
  end if

  call MPI_FINALIZE(ierr)
end
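
For what it's worth, with Open MPI this sketch should build and run with something like the following (wrapper and launcher names assumed; adjust for your install):

% mpif90 sendbuf.f90 -o sendbuf
% mpirun -np 4 ./sendbuf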


This version reuses send and receive buffers, but that's fine since they're all blocking calls anyhow.
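If the sends were non-blocking, that reuse would no longer be safe: the buffer could only be touched again after the request completes.  A sketch of the send loop in that style (MPI_ISEND/MPI_WAIT; the req variable is mine):

     integer req
     do j = 1, m
        call MPI_ISEND(buf, n, MPI_DOUBLE_COMPLEX, 0, j, MPI_COMM_WORLD, req, ierr)
        ! buf must not be reused or modified until the request completes
        call MPI_WAIT(req, MPI_STATUS_IGNORE, ierr)
     end do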
