Re: [OMPI users] trouble using --mca mpi_yield_when_idle 1

Jeff Squyres Fri, 12 Dec 2008 16:47:16 -0500

On Dec 12, 2008, at 3:22 PM, douglas.gupt...@dal.ca wrote:

I could imagine another alternative.  Construct a wrapper function that
intercepts MPI_Recv and turns it into something like

PMPI_Irecv
while ( ! done ) {
    nanosleep(short);
    PMPI_Test(&done);
}

I don't know how useful this would be for your particular case.


Thank you for the suggestion.  I didn't know about "PMPI_Irecv" (my
question was what/where did the "P" prefix to MPI come from?) so I
went back to the MPI standard, and re-read the description of
"mpi_send" and "mpi_recv".

The "P" is MPI's profiling interface. See chapter 14 in the MPI-2.1 doc.

Based on my re-read of the MPI standard, it appears that I may have
slightly mis-stated my issue.  The spin is probably taking place in
"mpi_send".  "mpi_send", according to my understanding of the MPI
standard, may not exit until a matching "mpi_recv" has been initiated,
or completed.  At least that is the conclusion I came to.


Perhaps something like this:

int MPI_Send(...) {
   MPI_Request req;
   int flag;
   PMPI_Isend(..., &req);
   do {
      nanosleep(short);
      PMPI_Test(&req, &flag, MPI_STATUS_IGNORE);
   } while (!flag);
}

That is, *you* provide MPI_Send and intercept all of your app's calls to MPI_Send. But you implement it by doing a non-blocking send, then sleeping and polling MPI to know when it's done. Of course, you don't have to implement this as MPI_Send: you could always have your_func_prefix_send(...) instead of explicitly using the MPI profiling interface. But using the profiling interface allows you to swap in/out different implementations of MPI_Send (etc.) at link time, if that's desirable to you.

Looping over sleep/test is not the most efficient way of doing it, but it may be suitable for your purposes.
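
For what it's worth, here is a minimal sketch of the sleep/test loop as a standalone helper, written so that it compiles without an MPI installation. The names sleep_poll, poll_fn, fake_request and fake_test are hypothetical and do not come from Open MPI or from this thread. The doubling sleep interval is also an assumption added here: it is one possible way to trade wake-up latency against CPU use, not something proposed above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns nonzero once the operation is done.
 * In an MPI wrapper, this is where PMPI_Test would be called. */
typedef int (*poll_fn)(void *ctx);

/* Poll done(ctx) until it returns nonzero, sleeping between polls.
 * The sleep starts at first_ns and doubles after each unsuccessful
 * poll, capped at max_ns. Returns the number of polls performed. */
static long sleep_poll(poll_fn done, void *ctx, long first_ns, long max_ns)
{
    long polls = 0;
    long ns = first_ns;

    assert(first_ns > 0 && max_ns >= first_ns);

    for (;;) {
        struct timespec nap;

        polls++;
        if (done(ctx))
            return polls;

        /* Split the interval so that tv_nsec stays below one second. */
        nap.tv_sec = ns / 1000000000L;
        nap.tv_nsec = ns % 1000000000L;
        nanosleep(&nap, NULL);

        /* Double the interval without overflowing past the cap. */
        ns = (ns > max_ns / 2) ? max_ns : ns * 2;
    }
}

/* Stand-in for a pending request (illustration only): it reports
 * completion on the poll that brings polls_left down to zero. */
struct fake_request {
    int polls_left;
};

static int fake_test(void *ctx)
{
    struct fake_request *r = ctx;

    if (r->polls_left > 0)
        r->polls_left--;
    return r->polls_left == 0;
}
```

In a real MPI_Send wrapper, the callback would presumably call PMPI_Test(&req, &flag, MPI_STATUS_IGNORE) on the request returned by PMPI_Isend and return flag, and the wrapper would pass the MPI error code back to the caller. A small cap (say, a millisecond or so) keeps the added latency bounded; suitable values depend on the application.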

However my complaint - sorry, I wish I could think of a better word -
remains.


No worries!  :-)

It appears that openmpi spin-waits, as opposed to, say,
going to sleep and waiting for a wake-up call.  Like a semaphore - if
those things still exist.

Correct. Most MPIs do at least some form of spin waiting (some do have the ability to block after a while). As mentioned on this thread, we have it on our roadmap, but the timing of when it happens is -- as yet -- unknown. We are driven by customer/user input, though, so if lots of people ask for this, there's more of a chance of it getting done than if no one is asking for it. :-)


--
Jeff Squyres
Cisco Systems
