On 9/8/2011 11:47 AM, Ghislain Lartigue wrote:
> I guess you're perfectly right!
> I will try to test it tomorrow by putting a call system("wait(X)") before
> the barrier!
What does "wait(X)" mean?

Anyhow, here is how I see your computation:

A)  The first barrier simply synchronizes the processes.
B)  Then you start a bunch of non-blocking, point-to-point messages.
C)  Then another barrier.
D)  Finally, the point-to-point messages are completed.

Your mental model might be that A, B, and C should be fast and that D should take a long time. The reality may be that the completion of all those messages is actually taking place during C.
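
In code, I picture something like the sketch below (C, with a simple left/right ring exchange of one large message standing in for whatever your real exchange does; the neighbor pattern and message size are my guesses, not your code):

/* Sketch of steps A-D above.  The ring exchange and the message size
 * are stand-ins for your real communication pattern. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    enum { COUNT = 1 << 20 };            /* big enough to be a "long" message */
    static double sbuf[COUNT], rbuf[COUNT];
    int right = (rank + 1) % size;
    int left  = (rank + size - 1) % size;
    MPI_Request reqs[2];

    MPI_Barrier(MPI_COMM_WORLD);                               /* A */

    MPI_Irecv(rbuf, COUNT, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sbuf, COUNT, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);  /* B */

    MPI_Barrier(MPI_COMM_WORLD);                               /* C */

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);                 /* D */

    MPI_Finalize();
    return 0;
}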

How about the following?

Barrier
t0 = MPI_Wtime()
start all non-blocking messages
t1 = MPI_Wtime()
Barrier
t2 = MPI_Wtime()
complete all messages
t3 = MPI_Wtime()
Barrier
t4 = MPI_Wtime()
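
For what it's worth, here is that skeleton made concrete, reusing the same ring-exchange stand-in as the sketch above; only the MPI_Wtime() bookkeeping and the final printf are new:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    enum { COUNT = 1 << 20 };
    static double sbuf[COUNT], rbuf[COUNT];
    int right = (rank + 1) % size;
    int left  = (rank + size - 1) % size;
    MPI_Request reqs[2];

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();

    MPI_Irecv(rbuf, COUNT, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(sbuf, COUNT, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);
    double t1 = MPI_Wtime();

    MPI_Barrier(MPI_COMM_WORLD);     /* drop this barrier for the comparison run */
    double t2 = MPI_Wtime();

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    double t3 = MPI_Wtime();

    MPI_Barrier(MPI_COMM_WORLD);
    double t4 = MPI_Wtime();

    printf("rank %d: start(B) %.6f barrier(C) %.6f complete(D) %.6f final %.6f\n",
           rank, t1 - t0, t2 - t1, t3 - t2, t4 - t3);

    MPI_Finalize();
    return 0;
}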

Then, look at the data from all the processes graphically. Compare the picture to the same experiment, but with the middle barrier missing. Presumably, the full iteration will take roughly as long in both cases. The difference, I would expect, is that with the middle barrier present, the barrier absorbs nearly all the time and the message completion is fast; without the middle barrier, the message completion is slow. So, message completion is taking a long time either way, and the only difference is whether it takes place during your MPI_Test loop or during what you thought was only a barrier.
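
If collecting one printed line per process is awkward to post-process, one possibility (just a fragment meant to drop into the timed program above, reusing rank, size, and t0..t4 from there) is to gather the intervals onto rank 0 and write them out in one place:

/* Drop-in for the timed sketch above (also needs <stdlib.h> for
 * malloc/free).  Gathers each rank's four intervals onto rank 0 so a
 * single process can write one easy-to-plot table. */
double intervals[4] = { t1 - t0, t2 - t1, t3 - t2, t4 - t3 };
double *all = NULL;
if (rank == 0)
    all = malloc(4 * size * sizeof(double));
MPI_Gather(intervals, 4, MPI_DOUBLE, all, 4, MPI_DOUBLE, 0, MPI_COMM_WORLD);
if (rank == 0) {
    for (int r = 0; r < size; r++)
        printf("%d %.6f %.6f %.6f %.6f\n",
               r, all[4*r], all[4*r+1], all[4*r+2], all[4*r+3]);
    free(all);
}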

A simple way of doing all this is to run with a time-line profiler... some MPI performance analysis tool. You won't have to instrument the code, dump timings, or figure out graphics. Just look at pretty pictures! There is some description of tool candidates in the OMPI FAQ at http://www.open-mpi.org/faq/?category=perftools

> PS:
> if anyone has more information about the implementation of the MPI_IRECV()
> procedure, I would be glad to learn more about it!

I don't know how much detail you want here, but I suspect not much is warranted. There is a lot of complexity, but I think a few key ideas will help.

First, I'm pretty sure you're sending "long" messages. OMPI usually sends such messages by queueing up a request, and in the general case these requests can be "progressed" whenever an MPI call is made. So, whenever you make an MPI call, get away from the idea that you're doing only the one specific thing named by the call and its arguments. Think instead that the library will also look around to see what other outstanding MPI work it can progress.
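
So, for example, sprinkling MPI_Test calls through your computation gives the library regular opportunities to advance everything that is outstanding, not just the request you name. A hedged sketch of that pattern (compute_and_progress and do_some_work are made-up names for illustration, not OMPI API):

#include <mpi.h>

static void do_some_work(int chunk)      /* placeholder for your computation */
{
    (void)chunk;
}

/* Interleave computation with occasional MPI_Test calls.  Each call
 * nominally tests one request, but it also gives the library a chance
 * to progress every other outstanding request (e.g., the handshake and
 * data transfer of queued "long" messages). */
void compute_and_progress(MPI_Request *req, int nchunks)
{
    int done = 0;
    for (int i = 0; i < nchunks; i++) {
        do_some_work(i);
        if (!done)
            MPI_Test(req, &done, MPI_STATUS_IGNORE);
    }
    if (!done)
        MPI_Wait(req, MPI_STATUS_IGNORE);   /* finish whatever remains */
}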
