On Apr 23, 2014, at 4:45 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:

>> is OK. So, if any nonblocking calls are used, one must use mpi.test or
>> mpi.wait to check if they are complete before trying any blocking calls.

That is also correct -- it's MPI semantics (communications initiated by 
MPI_Isend / MPI_Irecv must be completed via one of the flavors of MPI_Test or 
MPI_Wait).
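
A minimal C sketch of that rule (the function name and values here are just 
illustrative, not from Rmpi):

    /* Hypothetical example: every MPI_Isend returns an MPI_Request that
     * must be completed later with some flavor of MPI_Test or MPI_Wait. */
    #include <mpi.h>

    void send_example(int dest, MPI_Comm comm)
    {
        int payload = 42;
        MPI_Request req;

        MPI_Isend(&payload, 1, MPI_INT, dest, /* tag */ 0, comm, &req);

        /* ... overlap other work with the send in flight ... */

        /* Completion is mandatory: until MPI_Wait (or a successful
         * MPI_Test) returns, the send is not finished. */
        MPI_Wait(&req, MPI_STATUS_IGNORE);
    }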

> That sounds like a different problem than the one I encountered.  The
> system did get hung up, but the reason was that processes received
> corrupted R objects, threw an error, and stopped responding.
> 
> The root of my problem was that objects got garbage collected before the
> isend completed.  

This is definitely a problem for garbage-collecting languages.  MPI must 
control the buffer until the corresponding Test/Wait indicates that it has 
finished with the buffer.

If the buffer disappears or is changed out from under MPI, 
unpredictable/undefined behavior can certainly result.
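
In C terms, the hazard looks like this (hypothetical sketch; freeing the 
buffer early is the analogue of R garbage collecting the serialized object):

    #include <mpi.h>
    #include <stdlib.h>

    void buffer_lifetime(int dest, MPI_Comm comm)
    {
        double *buf = malloc(1000 * sizeof(double));
        MPI_Request req;

        MPI_Isend(buf, 1000, MPI_DOUBLE, dest, 0, comm, &req);

        /* WRONG here: free(buf); -- MPI may still be reading the buffer.
         * That premature free is exactly what garbage collection does to
         * the serialized R object if no reference keeps it alive. */

        MPI_Wait(&req, MPI_STATUS_IGNORE);
        free(buf);   /* safe: MPI has relinquished the buffer */
    }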

> This will happen regardless of subsequent R-level
> calls (e.g., to mpi.test).  The object to be transmitted is serialized
> and passed to C, but when the call returns there are no R references to
> the object--that is, the serialized version of the object--and so it is
> subject to garbage collection.

Yep.

> I'd be happy to provide my modifications to get around this.  Although
> they worked for me, they are not really suitable for general use.  There
> are 2 main issues: first, I ignored the asynchronous receive since I
> didn't use it.  Since MPI request objects are used for both sending and
> receiving, I suspect that mixing irecv's in with code doing isends would
> not work right.  

FWIW, this works fine.  It's quite common (in C and Fortran) to mix various 
kinds of MPI_Request handles into a single array-based Test or Wait.  MPI 
figures it out.
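
For example (hypothetical sketch), a send request and a receive request can 
be completed by one MPI_Waitall:

    #include <mpi.h>

    void mixed_requests(int peer, MPI_Comm comm)
    {
        int out = 1, in = 0;
        MPI_Request reqs[2];

        MPI_Isend(&out, 1, MPI_INT, peer, 0, comm, &reqs[0]);
        MPI_Irecv(&in,  1, MPI_INT, peer, 0, comm, &reqs[1]);

        /* One array-based Wait completes both; MPI tracks which handle
         * belongs to which kind of operation. */
        MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    }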

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
