On Dec 3, 2009, at 10:56 AM, Brock Palen wrote:

> The allocation statement is ok:
> allocate(vec(vec_size,vec_per_proc*(size-1)))
>
> This allocates memory vec(32768, 2350)

So this allocates 32768 rows, each with 2350 columns -- all stored contiguously in memory, in column-major order.  Does the language/compiler *guarantee* that the entire matrix is contiguous in memory?  Or does it only guarantee that the *columns* are contiguous in memory -- and there may be gaps between successive columns?

2350 means you're running with 48 procs.

In the loop:

   do irank=1,size-1
      do ivec=1,vec_per_proc
         write (6,*) 'irank=',irank,'ivec=',ivec
         vec_ind=(irank-1)*vec_per_proc+ivec
         call MPI_RECV( vec(1,vec_ind), vec_size, MPI_DOUBLE_COMPLEX, irank, &
              vec_ind, MPI_COMM_WORLD, status, ierror)
      end do
   end do

This means that in the first iteration, you're calling:

   irank = 1
   ivec = 1
   vec_ind = (1 - 1) * 50 + 1 = 1
   call MPI_RECV(vec(1, 1), 32768, ...)

And in the last iteration, you're calling:

   irank = 47
   ivec = 50
   vec_ind = (47 - 1) * 50 + 50 = 2350
   call MPI_RECV(vec(1, 2350), 32768, ...)

That doesn't seem right.  If I'm reading this right -- and I very well may not be -- and the columns are *not* back-to-back in memory, each receive of vec_size elements will run past the end of its own column, so successive receives will be partially overlaying the previous receive.  Is that what you intended?  Is MPI supposed to overflow into the next column properly?  I can see how a problem might occur here if the columns are not actually contiguous in memory...?

-- 
Jeff Squyres
jsquy...@cisco.com
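
P.S.  If it helps, below is a small standalone sketch -- just something I'd try, with the sizes from this thread hard-coded, not your actual code -- that prints the real spacing between the columns of an allocated double complex array.  It assumes a 16-byte double complex and uses the usual transfer()/c_loc() trick to turn addresses into integers, which is somewhat compiler-dependent:

   program check_layout
     use iso_c_binding, only : c_loc, c_intptr_t
     implicit none

     ! Sizes taken from the numbers earlier in this thread.
     integer, parameter :: vec_size     = 32768                        ! elements per column / per receive
     integer, parameter :: vec_per_proc = 50
     integer, parameter :: nprocs       = 48                           ! "size" in the MPI code
     integer, parameter :: ncols        = vec_per_proc * (nprocs - 1)  ! 2350

     complex(kind(0d0)), allocatable, target :: vec(:,:)               ! double complex: 16 bytes/element here
     integer(c_intptr_t) :: base, col2, lastcol

     allocate(vec(vec_size, ncols))

     ! Raw addresses of the first element of a few columns.
     base    = transfer(c_loc(vec(1, 1)),     base)
     col2    = transfer(c_loc(vec(1, 2)),     col2)
     lastcol = transfer(c_loc(vec(1, ncols)), lastcol)

     ! If the array is laid out back-to-back, consecutive columns are exactly
     ! vec_size*16 bytes apart, and the last column starts (ncols-1)*vec_size*16
     ! bytes past the first one.
     print *, 'column stride (bytes):          ', col2 - base
     print *, 'expected if contiguous (bytes): ', 16_c_intptr_t * vec_size
     print *, 'offset of last column (bytes):  ', lastcol - base
     print *, 'expected if contiguous (bytes): ', 16_c_intptr_t * vec_size * (ncols - 1)

     deallocate(vec)
   end program check_layout

If both pairs of numbers match, then each MPI_RECV of vec_size MPI_DOUBLE_COMPLEX into vec(1,vec_ind) starts exactly at a column boundary and writes at most one column's worth of bytes, so successive receives shouldn't be able to step on each other; if they don't match, the gaps-between-columns theory gets a lot more interesting.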