On 4/4/2014 6:01 PM, Ralph Castain wrote:
It sounds like you don't have a balance between sends and recvs somewhere - 
i.e., some apps send messages, but the intended recipient isn't issuing a recv 
and waiting until the message has been received before exiting. If the 
recipient leaves before the isend completes, then the isend will never complete 
and the waitall will not return.
I'm pretty sure the sends complete because I wait on something that can only be computed after the sends complete, and I know I have that result.

My current theory is that my modifications to Rmpi are not properly tracking all completed messages, so it thinks there are still outstanding messages and passes a positive count to the C-level MPI_Waitall along with request arrays that contain garbage. But I haven't isolated the problem.
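
To make that theory concrete, here is a minimal sketch of request bookkeeping in plain C. It is an illustration only, not the Rmpi code: `request_t`, `REQ_NULL`, `req_table` and the `table_*` functions are hypothetical names, and `REQ_NULL` merely stands in for MPI_REQUEST_NULL so the example compiles without MPI. The point is that the count handed to a waitall-style call should be derived from the same array it describes, after completed entries have been squeezed out.

```c
#include <assert.h>

/* Hypothetical stand-ins: request_t models an MPI_Request handle and
 * REQ_NULL models MPI_REQUEST_NULL. No real MPI calls are made. */
typedef int request_t;
#define REQ_NULL (-1)
#define MAX_REQS 8

typedef struct {
    request_t reqs[MAX_REQS];
    int count;   /* slots in use, including ones already completed */
} req_table;

/* Start with every slot null and nothing outstanding. */
void table_init(req_table *t)
{
    for (int i = 0; i < MAX_REQS; i++)
        t->reqs[i] = REQ_NULL;
    t->count = 0;
}

/* Record a newly started request; returns the slot it occupies. */
int table_add(req_table *t, request_t r)
{
    assert(t->count < MAX_REQS);
    t->reqs[t->count] = r;
    return t->count++;
}

/* Mark one slot as completed by overwriting it with the null handle. */
void table_complete(req_table *t, int slot)
{
    assert(slot >= 0 && slot < t->count);
    t->reqs[slot] = REQ_NULL;
}

/* Drop completed entries so that reqs[0..count-1] holds only live
 * requests. Returns the number still outstanding, which is the count
 * that would be passed on to a waitall-style call. */
int table_compact(req_table *t)
{
    int live = 0;
    for (int i = 0; i < t->count; i++)
        if (t->reqs[i] != REQ_NULL)
            t->reqs[live++] = t->reqs[i];
    for (int i = live; i < t->count; i++)
        t->reqs[i] = REQ_NULL;   /* leave no stale handles behind */
    t->count = live;
    return live;
}
```

If the count and the array are maintained separately, they can drift apart, which is the failure mode suspected above. In real MPI, a successful wait or test resets the completed handle to MPI_REQUEST_NULL, and MPI_Waitall ignores null handles in the array it is given, so an array kept in this state is safe to pass even when some entries have already completed.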

Ross


On Apr 4, 2014, at 5:20 PM, Ross Boylan <r...@biostat.ucsf.edu> wrote:

During shutdown of my application the processes issue a waitall, since they 
have done some Isends.  A couple of them never return from that call.

Could this be the result of some of the processes already being shut down (the 
processes with the problem were late in the shutdown sequence)?  If so, what is 
the recommended solution?  A barrier?

The shutdown proceeds in stages, but the processes in question are not told to 
shut down until all the messages they have sent have been received.  So there 
shouldn't be any outstanding messages from them.

My reading of the manual is that Waitall with a count of 0 should return 
immediately, not hang.  Is that correct?

Running under R with Open MPI 1.7.4.
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
