On Dec 12, 2008, at 3:22 PM, douglas.gupt...@dal.ca wrote:
I could imagine another alternative. Construct a wrapper function that
intercepts MPI_Recv and turns it into something like

PMPI_Irecv
while ( ! done ) {
    nanosleep(short);
    PMPI_Test(&done);
}

I don't know how useful this would be for your particular case.

Thank you for the suggestion. I didn't know about "PMPI_Irecv" (my
question was what/where did the "P" prefix to MPI come from?) so I
went back to the MPI standard, and re-read the description of
"mpi_send" and "mpi_recv".
The "P" is MPI's profiling interface. See chapter 14 in the MPI-2.1
doc.
Based on my re-read of the MPI standard, it appears that I may have
slightly mis-stated my issue. The spin is probably taking place in
"mpi_send". "mpi_send", according to my understanding of the MPI
standard, may not exit until a matching "mpi_recv" has been initiated,
or completed. At least that is the conclusion I came to.
Perhaps something like this:
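int MPI_Send(...) {
    MPI_Request req;
    int flag = 0;
    int rc;
    struct timespec ts = { 0, 1000000 };  /* 1 ms between polls; arbitrary, tune to taste */

    rc = PMPI_Isend(..., &req);           /* start the send without blocking */
    if (rc != MPI_SUCCESS)
        return rc;
    do {
        nanosleep(&ts, NULL);             /* give up the CPU instead of spinning */
        rc = PMPI_Test(&req, &flag, MPI_STATUS_IGNORE);
    } while (rc == MPI_SUCCESS && !flag);
    return rc;
}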
That is, *you* provide MPI_Send and intercept all your app's calls to
MPI_Send. But you implement it by doing a non-blocking send and
sleeping and polling MPI to know when it's done. Of course, you don't
have to implement this as MPI_Send -- you could always have
your_func_prefix_send(...) instead of explicitly using the MPI
profiling interface. But using the profiling interface allows you to
swap in/out different implementations of MPI_Send (etc.) at link time,
if that's desirable to you.
Looping over sleep/test is not the most efficient way of doing it, but
it may be suitable for your purposes.
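If the fixed sleep interval turns out to cost too much latency on short
messages, one possible refinement (my assumption, not something Open MPI
provides) is to poll once right away and then double the sleep after each
unsuccessful poll, up to a cap. Here is a rough, MPI-free sketch of just
that loop. The names wait_with_backoff, poll_fn, fake_request and fake_test
are all made up for illustration; in a real wrapper the callback would call
PMPI_Test() on the pending request.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

/* Hypothetical callback type: returns true once the operation is complete.
 * In an MPI wrapper this would call PMPI_Test() on the pending request. */
typedef bool (*poll_fn)(void *ctx);

/* Poll until poll(ctx) reports completion, sleeping between polls.
 * The sleep starts at min_ns and doubles after each unsuccessful poll,
 * capped at max_ns. Returns the number of polls performed. */
static long wait_with_backoff(poll_fn poll, void *ctx, long min_ns, long max_ns)
{
    long polls = 0;
    long delay = min_ns;

    for (;;) {
        polls++;
        if (poll(ctx))
            return polls;

        struct timespec ts = { delay / 1000000000L, delay % 1000000000L };
        nanosleep(&ts, NULL);   /* yield the CPU instead of spinning */

        if (delay < max_ns) {
            delay *= 2;
            if (delay > max_ns)
                delay = max_ns;
        }
    }
}

/* Stand-in for a pending request: completes after a fixed number of polls. */
struct fake_request { int polls_left; };

static bool fake_test(void *ctx)
{
    struct fake_request *r = ctx;
    return --r->polls_left <= 0;
}
```

Calling wait_with_backoff(fake_test, &r, 1000L, 8000L) on a fake_request
initialized to { 3 } polls three times, sleeping 1 and then 2 microseconds
in between. The trade-off is the usual one: a small min_ns keeps latency
low for sends that finish quickly, while max_ns bounds how stale the last
poll can be on long waits.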
However my complaint - sorry, I wish I could think of a better word -
remains.
No worries! :-)
It appears that openmpi spin-waits, as opposed to, say,
going to sleep and waiting for a wake-up call. Like a semaphore - if
those things still exist.
Correct. Most MPI implementations do at least some form of spin waiting (some do
have the ability to block after a while). As mentioned on this
thread, we have it on our roadmap, but the timing of when it happens
is -- as yet -- unknown. We are driven by customer/user input,
though, so if lots of people ask for this, there's more of a chance
for it getting done than if no one is asking for it. :-)
--
Jeff Squyres
Cisco Systems