On 4/4/2014 6:01 PM, Ralph Castain wrote:
> It sounds like you don't have a balance between sends and recvs somewhere -
> i.e., some apps send messages, but the intended recipient isn't issuing a recv
> and waiting until the message has been received before exiting. If the
> recipient leaves before the isend completes, then the isend will never complete
> and the waitall will not return.
I'm pretty sure the sends complete, because I wait on something that can
only be computed after the sends complete, and I know I have that result.
My current theory is that my modifications to Rmpi are not properly
tracking all completed messages, so it thinks there are still
outstanding messages (and passes a positive count to the C-level
MPI_Waitall, along with arrays full of garbage). But I haven't isolated
the problem.
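To make that theory concrete, here is a minimal sketch of the bookkeeping invariant in question. It is not Rmpi's code and it does not call MPI: request_t, REQUEST_NULL, request_table, and every function name below are hypothetical stand-ins (a real program would use MPI_Request and MPI_REQUEST_NULL from <mpi.h>). The idea is only that the count and the handle array must stay in agreement, so that nothing stale is ever passed along with a positive count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins so the sketch compiles without MPI. A real
 * program would use MPI_Request and MPI_REQUEST_NULL from <mpi.h>. */
typedef int request_t;
#define REQUEST_NULL (-1)
#define MAX_REQUESTS 8

typedef struct {
    request_t slots[MAX_REQUESTS]; /* live handles are packed at the front */
    size_t count;                  /* number of live handles */
} request_table;

/* Start with every slot null and nothing outstanding. */
void table_init(request_table *t) {
    for (size_t i = 0; i < MAX_REQUESTS; i++) {
        t->slots[i] = REQUEST_NULL;
    }
    t->count = 0;
}

/* Record the handle of a newly started nonblocking send.
 * Returns 0 on success, -1 if the table is full or the handle is null. */
int table_add(request_table *t, request_t r) {
    if (r == REQUEST_NULL || t->count == MAX_REQUESTS) {
        return -1;
    }
    t->slots[t->count++] = r;
    return 0;
}

/* Mark one handle as completed. The last live handle is moved into the
 * hole and the vacated slot is reset, so slots[0..count) never holds a
 * stale entry. Returns -1 if the handle is not currently tracked. */
int table_complete(request_table *t, request_t r) {
    for (size_t i = 0; i < t->count; i++) {
        if (t->slots[i] == r) {
            t->slots[i] = t->slots[t->count - 1];
            t->slots[t->count - 1] = REQUEST_NULL;
            t->count--;
            return 0;
        }
    }
    return -1;
}

/* Invariant: exactly the first `count` slots are live, the rest are null. */
int table_is_consistent(const request_table *t) {
    if (t->count > MAX_REQUESTS) {
        return 0;
    }
    for (size_t i = 0; i < MAX_REQUESTS; i++) {
        int live = (t->slots[i] != REQUEST_NULL);
        if ((i < t->count) != live) {
            return 0;
        }
    }
    return 1;
}

/* Walk through three sends completing out of order and return how many
 * handles the table still believes are outstanding at shutdown. */
size_t demo_outstanding_after_shutdown(void) {
    request_table t;
    table_init(&t);
    table_add(&t, 101);
    table_add(&t, 102);
    table_add(&t, 103);

    table_complete(&t, 102);
    assert(t.count == 2 && table_is_consistent(&t));

    table_complete(&t, 101);
    table_complete(&t, 103);
    assert(table_complete(&t, 102) == -1); /* double completion is rejected */
    assert(table_is_consistent(&t));
    return t.count;
}
```

In a real program, t.count and t.slots would be what gets handed to MPI_Waitall at shutdown. Checking something like table_is_consistent just before that call would show whether the count and the array have drifted apart, which is the failure I suspect.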
Ross
> On Apr 4, 2014, at 5:20 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:
>> During shutdown of my application the processes issue a waitall, since they
>> have done some Isends. A couple of them never return from that call.
>> Could this be the result of some of the processes having already shut down (the
>> processes with the problem were late in the shutdown sequence)? If so, what is
>> the recommended solution? A barrier?
>> The shutdown proceeds in stages, but the processes in question are not told to
>> shut down until all the messages they have sent have been received. So there
>> shouldn't be any outstanding messages from them.
>> My reading of the manual is that Waitall with a count of 0 should return
>> immediately, not hang. Is that correct?
>> Running under R with Open MPI 1.7.4.
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users