On Mar 16, 2014, at 10:24 PM, christophe petit <christophe.peti...@gmail.com> 
wrote:

> I am studying the optimization strategy when the number of communication 
> functions in a code is high.
> 
> My courses on MPI say two things for optimization which are contradictory :
> 
> 1*) You have to use temporary message copy to allow non-blocking sending and 
> uncouple the sending and receiving

There are a lot of schools of thought here, and the real answer is going to 
depend on your application.

If the message is "short" (and the exact definition of "short" depends on your 
platform -- it varies depending on your CPU, your memory, your CPU/memory 
interconnect, ...etc.), then copying to a pre-allocated bounce buffer is 
typically a good idea.  That lets you keep using your "real" buffer and not 
have to wait until communication is done.

For "long" messages, the equation is a bit different.  If "long" isn't 
"enormous", you might be able to have N buffers available, and simply work on 1 
of them at a time in your main application and use the others for ongoing 
non-blocking communication.  This is sometimes called "shadow" copies, or 
"ghost" copies.

Such shadow copies are most useful when you receive something each iteration.  
For example, something like this:

  buffer[0] = malloc(...);
  buffer[1] = malloc(...);
  current = 0;
  while (still_doing_iterations) {
      MPI_Irecv(buffer[current], ..., &req);
      /* work on buffer[1 - current], received last iteration */
      MPI_Wait(&req, MPI_STATUS_IGNORE);
      current = 1 - current;
  }

You get the idea.

> 2*) Avoid using temporary message copy because the copy will add extra cost 
> on execution time. 

It will, if the memcpy cost is significant (especially compared to the network 
time to send it).  If the memcpy is small/insignificant, then don't worry about 
it.

You'll need to determine where this crossover point is, however.

Also keep in mind that MPI and/or the underlying network stack will likely be 
doing these kinds of things under the covers for you.  Indeed, if you send 
short messages -- even via MPI_SEND -- it may return "immediately", indicating 
that MPI says it's safe for you to use the send buffer.  But that doesn't mean 
that the message has even actually left the current server and gone out onto 
the network yet (i.e., some other layer below you may have just done a memcpy 
because it was a short message, and the processing/sending of that message is 
still ongoing).

> And then, we are advised to do : 
> 
> - replace MPI_SEND with MPI_SSEND (synchronous blocking send) : it is said 
> that execution time is divided by a factor of 2

This very, very much depends on your application.

MPI_SSEND won't return until the receiver has started to receive the message.

For some communication patterns, putting in this additional level of 
synchronization is helpful -- it keeps all MPI processes in tighter 
synchronization and you might experience less jitter, etc.  And therefore 
overall execution time is faster.

But for others, it adds unnecessary delay.

I'd say it's an over-generalization that simply replacing MPI_SEND with 
MPI_SSEND always cuts execution time by a factor of 2.

> - use MPI_ISSEND and MPI_IRECV with the MPI_WAIT function to synchronize 
> (synchronous non-blocking send) : it is said that execution time is divided 
> by a factor of 3

Again, it depends on the app.  Generally, non-blocking communication is better 
-- *if your app can effectively overlap communication and computation*.

If your app doesn't take advantage of this overlap, then you won't see such 
performance benefits.  For example:

   MPI_Isend(buffer, ..., &req);
   MPI_Wait(&req, ...);

Technically, the above uses ISEND and WAIT... but it's actually probably going 
to be *slower* than using MPI_SEND because you've made multiple function calls 
with no additional work between the two -- so the app didn't effectively 
overlap the communication with any local computation.  Hence: no performance 
benefit.

> So what's the best optimization ? Do we have to use temporary message copy or 
> not and if yes, what's the case for ?

As you can probably see from my text above, the answer is: it depends.  :-)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
