On Apr 23, 2014, at 4:45 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
>> is OK. So, if any nonblocking calls are used, one must use mpi.test or
>> mpi.wait to check if they are complete before trying any blocking calls.

That is also correct -- it's MPI semantics (communications initiated by
MPI_Isend / MPI_Irecv must be completed via one of the flavors of MPI_Test
or MPI_Wait).

> That sounds like a different problem than the one I encountered. The
> system did get hung up, but the reason was that processes received
> corrupted R objects, threw an error, and stopped responding.
>
> The root of my problem was that objects got garbage collected before the
> isend completed.

This is definitely a problem in garbage-collected languages. MPI needs to
control the buffer until the corresponding Test/Wait indicates that MPI is
finished with it. If the buffer disappears or is changed out from underneath
MPI, unpredictable/undefined behavior can certainly result. (See the first
sketch below my sig for the C analogue.)

> This will happen regardless of subsequent R-level
> calls (e.g., to mpi.test). The object to be transmitted is serialized
> and passed to C, but when the call returns there are no R references to
> the object--that is, the serialized version of the object--and so it is
> subject to garbage collection.

Yep.

> I'd be happy to provide my modifications to get around this. Although
> they worked for me, they are not really suitable for general use. There
> are 2 main issues: first, I ignored the asynchronous receive since I
> didn't use it. Since MPI request objects are used for both sending and
> receiving, I suspect that mixing irecv's in with code doing isends would
> not work right.

FWIW, this works fine. It's quite common (in C and Fortran) to mix various
kinds of MPI_Request handles into a single array-based Test or Wait. MPI
figures it out. (Second sketch below.)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/
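
P.S. To make the buffer-ownership point concrete, here's a minimal C sketch
(mine, for illustration -- not code from Ross's Rmpi modifications). The
buffer handed to MPI_Isend must stay untouched until MPI_Wait (or a
successful MPI_Test) completes the request; freeing it early is the C
analogue of R's garbage collector reclaiming the serialized object.

#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    if (size < 2) {           /* need a sender and a receiver */
        MPI_Finalize();
        return 1;
    }

    if (rank == 0) {
        int *buf = malloc(100 * sizeof(int));
        for (int i = 0; i < 100; ++i)
            buf[i] = i;

        MPI_Request req;
        MPI_Isend(buf, 100, MPI_INT, 1, 0, MPI_COMM_WORLD, &req);

        /* WRONG here: free(buf) -- MPI may still be reading it.
         * This is exactly what R's GC did to the serialized object. */

        MPI_Wait(&req, MPI_STATUS_IGNORE);  /* send is now complete */
        free(buf);                          /* buffer is ours again  */
    } else if (rank == 1) {
        int buf[100];
        MPI_Recv(buf, 100, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
    }

    MPI_Finalize();
    return 0;
}

The fix in a garbage-collected binding is the same idea: keep a live
reference to the serialized buffer, keyed by the request, and drop it only
after Test/Wait reports that the request has completed.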
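
P.P.S. And a sketch of mixing request kinds: one MPI_Irecv and one MPI_Isend
completed by a single MPI_Waitall -- the array-based completion I mentioned
above. (Again mine, for illustration.)

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Ring exchange: receive from the left neighbor, send to the right. */
    int right = (rank + 1) % size;
    int left  = (rank + size - 1) % size;

    int sendbuf = rank, recvbuf = -1;
    MPI_Request reqs[2];  /* one receive request, one send request */

    MPI_Irecv(&recvbuf, 1, MPI_INT, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&sendbuf, 1, MPI_INT, right, 0, MPI_COMM_WORLD, &reqs[1]);

    /* A single Waitall completes both; MPI doesn't care that the array
     * holds a mix of send and receive requests. */
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    printf("rank %d got %d from rank %d\n", rank, recvbuf, left);

    MPI_Finalize();
    return 0;
}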