On Jul 14, 2011, at 8:33 PM, dave fournier wrote:

> Sorry, I should have said it doesn't get sent until the *master* encounters an
> MPI_Recv.
> Then suddenly the slave finally gets the message and carries on its task.
> 
> I know that the slave is waiting because:
> 1.) it doesn't print anything
> 2.) I have attached to it with gdb previously to monitor the behaviour.

Ah -- so you're saying that the master does something like this:

Time = A: Master calls MPI_Isend(msg, ..., &req);
Time = B: Master goes off and does other things
Time = C: Slave calls MPI_Recv(msg, ...);
Time = D: more time passes
Time = E: Master calls MPI_Recv(some_other_msg, ...);

And you're saying that the slave should be getting the message (more or less) 
instantly at Time=C, but instead gets it at Time=E, right?

If so, it's because Open MPI does not do background progress on non-blocking
sends in all cases.  Specifically, if you're sending over TCP and the message
is "long", the OMPI layer in the master doesn't send the whole message
immediately because it doesn't want to unexpectedly consume a lot of resources
in the slave.  Instead, the master sends only a small first fragment of the
message along with the (communicator, tag) tuple that the receiver needs for
matching.  When the receiver posts a corresponding MPI_Recv (Time=C), it sends
an ACK back to the master, enabling the master to send the rest of the message.

However, since OMPI doesn't support background progress in all situations, the 
master doesn't see this ACK until it goes into the MPI progression engine -- 
i.e., when you call MPI_Recv() at Time=E.  Then the OMPI layer in the master 
sees the ACK and sends the rest of the message.

Make sense?

You can make quick dips into the OMPI progression engine by calling MPI_Test()
on the request that you got back from MPI_Isend() -- e.g., you can do this at
Times B, C, and D.  This is not as intrusive as calling MPI_Recv(), because
MPI_Test() returns immediately instead of blocking, and it may allow your
message to be transferred earlier.
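
To make the timeline concrete, here is a toy C model of the handshake described
above.  It is only a sketch and is not Open MPI code: the function name
delivery_time, the integer "ticks", and the zero-network-latency simplification
are all assumptions made for illustration.  It computes when the receiver would
get the full message, given when the master posts the send, when the slave
posts the receive, and the ticks at which the master enters the progression
engine (for example via MPI_Test() or MPI_Recv()).

```c
#include <assert.h>
#include <stddef.h>

/* Toy model (NOT Open MPI internals) of a rendezvous-style "long" send.
 * All times are abstract, non-negative integer ticks; network latency is
 * assumed to be zero.
 *
 * isend_time      - tick at which the master posts the non-blocking send
 * recv_post_time  - tick at which the slave posts the matching receive
 * progress_times  - ticks at which the master enters the progress engine
 * n_progress      - number of entries in progress_times
 *
 * Returns the tick at which the rest of the message goes out, or -1 if the
 * master never enters the progress engine once the ACK is available. */
int delivery_time(int isend_time, int recv_post_time,
                  const int *progress_times, size_t n_progress)
{
    assert(progress_times != NULL || n_progress == 0);

    /* The ACK exists only once the send has started AND the receive has
     * been posted, i.e. at the later of the two ticks. */
    int ack_time = recv_post_time > isend_time ? recv_post_time : isend_time;

    /* The master notices the ACK at its earliest progress call that is
     * at or after ack_time; only then is the rest of the message sent. */
    int best = -1;
    for (size_t i = 0; i < n_progress; i++) {
        int t = progress_times[i];
        if (t >= ack_time && (best == -1 || t < best)) {
            best = t;
        }
    }
    return best;
}
```

Mapping Times A..E to ticks 0..4 (send posted at 0, receive posted at 2): if
the master only enters the progression engine at tick 4, the model returns 4,
which is the delay you are seeing.  If the master also polls at ticks 1, 2,
and 3, the model returns 2, i.e. the transfer completes as soon as the slave
has posted its receive.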

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

