Arunkumar C R wrote:
Hello,
I have written an MPI code to find acceleration (a = F/m) of a system
of particles. I would like to know how to perform MPI_GATHER of all
the output data at the root process and print the whole array (in the
present case, f(1:n) ) and reuse it for some other purpose in the
serial part of the same code. Please check the program given below
and give me a solution.
program g
use mpi
implicit none
real(kind(1d0)),allocatable,dimension(:)::a, f
real(kind(1d0))::m
integer::i, n, true, ierr, np, irank, istart, iend, var, nprocs
nprocs= 4 !no: of processors used
n = 100 !no: of array elements
m = 12.0d0 !mass of the moving particle
var = n / nprocs !no: of array elements per process
What is n is not a multiple of nprocs? E.g., on the last process, will
you write out of bounds?
allocate(a(n), f(n), stat=true)
if(true.ne.0) print*,'mem err'
call mpi_init(ierr)
call mpi_comm_size(mpi_comm_world, np, ierr)
Should you check that np==nprocs?
call mpi_comm_rank(mpi_comm_world, irank, ierr)
istart= irank * var + 1
iend = (irank + 1) * var
do i= istart, iend
f(i) = dble(i) !the magnitude of force is
taken as continuous (real) numbers !!
a(i) = f(i) / m
end do
You're allocating a full-size array on each process. I guess that's
okay, but it isn't necessary.
Anyhow, you can try:
call
MPI_Gather(a(istart),var,MPI_REAL8,a,var,MPI_REAL8,0,MPI_COMM_WORLD,ierr)
Strictly speaking, this is not correct since the send and receive
buffers overlap. So, the real thing to do is to declare a full array
a(1:n) (only necessary on the root rank 0) and a "local" array
a_local(istart:iend) on each process.
if(irank.eq.0) then !root process
do i=1,n
print*,i,a(i) !here, all the root
elements are printed correctly and the other elements as zeroes
end do
end if
call mpi_finalize(ierr)
end