amjad ali wrote:

I am parallelizing a 2D CFD code in Fortran with Open MPI. Suppose that the grid (all triangles) is partitioned among 8 processes using METIS. Each process has a different number of neighboring processes. Suppose each process has n elements/faces whose data it needs to send to the corresponding neighboring processes, and m elements/faces for which it needs to receive data from the corresponding neighboring processes. The values of n and m differ from process to process. Another aim is to hide the communication behind computation. For this I do the following on each process:

 

DO j = 1 to n
   CALL MPI_ISEND (send_data, num, type, dest(j), tag, MPI_COMM_WORLD, ireq(j), ierr)
ENDDO

DO k = 1 to m
   CALL MPI_RECV(recv_data, num, type, source(k), tag, MPI_COMM_WORLD, status, ierr)
ENDDO

 

This solves my problem, but it leaks memory: RAM fills up after a few thousand iterations. What is the remedy? How should I tackle this?

 

In another CFD code I got rid of this memory-filling problem as follows (in that code n = m):

 

DO j = 1 to n
   CALL MPI_ISEND (send_data, num, type, dest(j), tag, MPI_COMM_WORLD, ireq(j), ierr)
ENDDO

CALL MPI_WAITALL(n,ireq,status,ierr)

DO k = 1 to n
   CALL MPI_RECV(recv_data, num, type, source(k), tag, MPI_COMM_WORLD, status, ierr)
ENDDO

 

But this does not work in the current code, and the previous code did not give correct results with a large number of processes.

I don't know how literally to read the code you sent.  Maybe your actual code "does the right thing", but just to confirm I think the correct code should look like this:

DO J=1, N
   CALL MPI_ISEND(...)
END DO

DO K=1, M
   CALL MPI_RECV(...)
END DO

CALL MPI_WAITALL(...)

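That is, you start all non-blocking sends.  Then you perform receives.  Then you complete the sends.  Every MPI_ISEND creates a request that has to be completed (for example with MPI_WAIT or MPI_WAITALL); if the requests are never completed, the library cannot release them, which would be consistent with the memory growth you describe.  More commonly, one would post all receives first using non-blocking calls (MPI_IRECV), then perform all sends (MPI_SEND), then complete the receives with MPI_WAITALL.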
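
As a rough illustration of the "receives first" ordering, here is a minimal sketch. The routine name, the argument list, the tag value, the MPI_DOUBLE_PRECISION type, and the buffer layout (one column of send_buf or recv_buf per message, so that no two pending operations share a buffer) are all assumptions made for the example, not taken from your code.

```fortran
! Sketch only: names, tag, datatype and buffer layout are assumptions.
subroutine exchange_recv_first(n, m, num, dest, source, send_buf, recv_buf)
   use mpi
   implicit none
   integer, intent(in) :: n, m, num            ! n sends, m receives, num values each
   integer, intent(in) :: dest(n), source(m)   ! neighbor ranks
   double precision, intent(in)  :: send_buf(num, n)
   double precision, intent(out) :: recv_buf(num, m)
   integer, parameter :: tag = 100             ! arbitrary tag for this exchange
   integer :: rreq(m)                          ! one request per posted receive
   integer :: j, k, ierr

   ! 1. Post all receives; each one targets its own column of recv_buf.
   do k = 1, m
      call MPI_IRECV(recv_buf(1, k), num, MPI_DOUBLE_PRECISION, source(k), &
                     tag, MPI_COMM_WORLD, rreq(k), ierr)
   end do

   ! 2. Perform the (blocking) sends.
   do j = 1, n
      call MPI_SEND(send_buf(1, j), num, MPI_DOUBLE_PRECISION, dest(j), &
                    tag, MPI_COMM_WORLD, ierr)
   end do

   ! 3. Complete every receive; this also releases the request objects.
   call MPI_WAITALL(m, rreq, MPI_STATUSES_IGNORE, ierr)
end subroutine exchange_recv_first
```

Because the receives are already posted when the sends start, the blocking sends do not have to wait for a matching receive to appear later in the code.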

Yet another option is to post non-blocking receives, then non-blocking sends, then complete all sends and receives with a WAITALL call that has M+N requests.
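
A sketch of that fully non-blocking variant follows, under the same assumptions as before (hypothetical routine name and arguments, an arbitrary tag, MPI_DOUBLE_PRECISION data, one buffer column per message). The only structural change is that sends and receives share one request array of size M+N.

```fortran
! Sketch only: names, tag, datatype and buffer layout are assumptions.
subroutine exchange_nonblocking(n, m, num, dest, source, send_buf, recv_buf)
   use mpi
   implicit none
   integer, intent(in) :: n, m, num            ! n sends, m receives, num values each
   integer, intent(in) :: dest(n), source(m)   ! neighbor ranks
   double precision, intent(in)  :: send_buf(num, n)
   double precision, intent(out) :: recv_buf(num, m)
   integer, parameter :: tag = 100             ! arbitrary tag for this exchange
   integer :: req(m + n)                       ! receives in 1..m, sends in m+1..m+n
   integer :: j, k, ierr

   ! Post the non-blocking receives.
   do k = 1, m
      call MPI_IRECV(recv_buf(1, k), num, MPI_DOUBLE_PRECISION, source(k), &
                     tag, MPI_COMM_WORLD, req(k), ierr)
   end do

   ! Start the non-blocking sends.
   do j = 1, n
      call MPI_ISEND(send_buf(1, j), num, MPI_DOUBLE_PRECISION, dest(j), &
                     tag, MPI_COMM_WORLD, req(m + j), ierr)
   end do

   ! Work that touches neither send_buf nor recv_buf could be done here
   ! to overlap computation with communication.

   ! Complete all M+N operations with a single call.
   call MPI_WAITALL(m + n, req, MPI_STATUSES_IGNORE, ierr)
end subroutine exchange_nonblocking
```

Until MPI_WAITALL returns, the send buffers must not be modified and the receive buffers must not be read, so any overlapped computation has to stay away from both.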

Sorry if you already knew all this and I'm just overreacting to the simplified code you sent out.
