Shaun Jackman wrote:
Jeff Squyres wrote:
On Aug 26, 2009, at 10:38 AM, Jeff Squyres (jsquyres) wrote:

Yes, this could cause blocking.  Specifically, the receiver may not
advance any other senders until the matching Irecv is posted and is
able to make progress.
I should clarify something else here -- for long messages where the pipeline protocol is used, OB1 may need to be invoked repeatedly to keep making progress on all the successive fragments. I.e., if a send is long enough to entail many fragments, then OB1 may (read: likely will) not progress *all* of them simultaneously. Hence, if you're calling MPI_Test(), for example, to kick the progress engine, you may have to call it a few times to get *all* the fragments processed.

How many fragments are processed in each call to progress can depend on the speed of your hardware and network, etc.
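
To illustrate (a rough sketch, not code from Open MPI itself; peer, TAG, and do_some_local_work() are placeholders): pre-post the receive and then poll MPI_Test so the progress engine gets invoked often enough to push all of the fragments through:

    /* Rough sketch only -- peer, TAG, and do_some_local_work() are
     * placeholders, and the buffer size is arbitrary but large enough
     * that the pipeline protocol would kick in. */
    char        buf[4 * 1024 * 1024];
    MPI_Request req;
    MPI_Status  status;
    int         done = 0;

    MPI_Irecv(buf, sizeof(buf), MPI_CHAR, peer, TAG, MPI_COMM_WORLD, &req);

    while (!done) {
        do_some_local_work();            /* overlap computation with the transfer */
        MPI_Test(&req, &done, &status);  /* each call may only progress some fragments */
    }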

Hi Jeff,

Looking at the source code of MPI_Request_get_status, it...
- calls OPAL_CR_NOOP_PROGRESS()
- returns true in *flag if request->req_complete
- otherwise calls opal_progress()
- and returns false in *flag
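
In other words, the flow seems to be roughly this (my paraphrase of the source, not the actual Open MPI code):

    int MPI_Request_get_status(MPI_Request request, int *flag,
                               MPI_Status *status)
    {
        OPAL_CR_NOOP_PROGRESS();        /* no-op unless checkpoint/restart support is active */

        if (request->req_complete) {    /* request already finished */
            *flag = 1;
            /* ... copy the request's status into *status ... */
            return MPI_SUCCESS;
        }

        opal_progress();                /* otherwise, kick the progress engine once */
        *flag = 0;                      /* and report "not complete" */
        return MPI_SUCCESS;
    }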

What's the difference between OPAL_CR_NOOP_PROGRESS() and opal_progress()?

OPAL_CR_NOOP_PROGRESS() seems to be related to checkpoint/restart and is a no-op unless fault-tolerance is being used.

Two questions then...

1. If the request has already completed, opal_progress() is never called. Does that mean no further progress is made on any other outstanding requests during this call?

2. request->req_complete is tested before opal_progress() is called. Is it possible that request->req_complete becomes true during the call to opal_progress(), so that this function returns false in *flag even though the request has in fact completed?
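
In other words, could a polling loop like this one (my own sketch; req is an outstanding request) spin one extra iteration because a completion that happens inside opal_progress() isn't reported until the next call?

    int        flag = 0;
    MPI_Status status;

    while (!flag) {
        /* If the request completes inside opal_progress() during this call,
         * *flag is apparently still set to false; the completion would only
         * be reported on the next iteration. */
        MPI_Request_get_status(req, &flag, &status);
    }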

Thanks,
Shaun
