On Dec 12, 2008, at 3:22 PM, douglas.gupt...@dal.ca wrote:

I could imagine another alternative.  Construct a wrapper function that
intercepts MPI_Recv and turns it into something like

PMPI_Irecv
while ( ! done ) {
    nanosleep(short);
    PMPI_Test(&done);
}

I don't know how useful this would be for your particular case.

Thank you for the suggestion.  I didn't know about "PMPI_Irecv" (my
question was what/where did the "P" prefix to MPI come from?) so I
went back to the MPI standard, and re-read the description of
"mpi_send" and "mpi_recv".

The "P" is MPI's profiling interface. See chapter 14 in the MPI-2.1 doc.

Based on my re-read of the MPI standard, it appears that I may have
slightly mis-stated my issue.  The spin is probably taking place in
"mpi_send".  "mpi_send", according to my understanding of the MPI
standard, may not exit until a matching "mpi_recv" has been initiated,
or completed.  At least that is the conclusion I came to.

Perhaps something like this:

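#include <mpi.h>
#include <time.h>

/* MPI-3 signature; older mpi.h headers declare buf as plain "void *". */
int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm) {
   MPI_Request req;
   int flag = 0;
   /* 1 ms between polls; an arbitrary value, tune it for your app. */
   struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000000 };
   int rc = PMPI_Isend(buf, count, datatype, dest, tag, comm, &req);
   if (rc != MPI_SUCCESS) return rc;
   do {
      nanosleep(&ts, NULL);
      rc = PMPI_Test(&req, &flag, MPI_STATUS_IGNORE);
   } while (rc == MPI_SUCCESS && !flag);
   return rc;
}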

That is, *you* provide MPI_Send and intercept all of your app's calls to MPI_Send. But you implement it by doing a non-blocking send, then sleeping and polling MPI to find out when it's done. Of course, you don't have to implement this as MPI_Send -- you could always have your_func_prefix_send(...) instead of explicitly using the MPI profiling interface. But using the profiling interface allows you to swap in/out different implementations of MPI_Send (etc.) at link time, if that's desirable to you.

Looping over sleep/test is not the most efficient way of doing it, but it may be suitable for your purposes.
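One way to soften the cost of the sleep/test loop is to back off: poll quickly at first, so short operations finish with little added latency, and sleep longer as the wait drags on. The following is only a rough sketch of that idea, not anything Open MPI provides. The names poll_with_backoff, poll_fn, fake_request and fake_test are all made up for illustration, and the completion check is a plain callback so the example compiles without MPI. In a real wrapper the callback would call PMPI_Test on the pending request.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns true once the operation is complete.
 * In an MPI wrapper this would call PMPI_Test on the pending request. */
typedef bool (*poll_fn)(void *ctx);

/* Poll done(ctx) until it returns true, sleeping between attempts.
 * The sleep starts at min_ns and doubles after every unsuccessful poll,
 * capped at max_ns. Returns the number of polls performed. */
long poll_with_backoff(poll_fn done, void *ctx, long min_ns, long max_ns)
{
    long delay_ns = min_ns;
    long polls = 0;

    for (;;) {
        polls++;
        if (done(ctx))
            return polls;

        struct timespec ts = {
            .tv_sec  = delay_ns / 1000000000L,
            .tv_nsec = delay_ns % 1000000000L
        };
        nanosleep(&ts, NULL);

        /* Double the delay without overflowing past the cap. */
        delay_ns = (delay_ns > max_ns / 2) ? max_ns : delay_ns * 2;
    }
}

/* Stand-in for a pending request: "completes" after a fixed number of polls. */
struct fake_request { int polls_left; };

bool fake_test(void *ctx)
{
    struct fake_request *r = ctx;
    if (r->polls_left > 0)
        r->polls_left--;
    return r->polls_left == 0;
}
```

For example, with a fake_request whose polls_left is 3, calling poll_with_backoff(fake_test, &r, 1000L, 8000L) sleeps about 1 us and then 2 us before the third poll succeeds. Unlike the fixed-interval loop, this version tests before the first sleep, so an operation that is already complete returns immediately.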

However my complaint - sorry, I wish I could think of a better word -
remains.

No worries!  :-)

It appears that openmpi spin-waits, as opposed to, say,
going to sleep and waiting for a wake-up call.  Like a semaphore - if
those things still exist.


Correct. Most MPIs do at least some form of spin waiting (some do have the ability to block after a while). As mentioned on this thread, we have it on our roadmap, but the timing of when it happens is -- as yet -- unknown. We are driven by customer/user input, though, so if lots of people ask for this, there's more of a chance of it getting done than if no one is asking for it. :-)

--
Jeff Squyres
Cisco Systems
