David Zhang wrote:
Blocking send/recv, as the name suggests, stops processing your master and slave code until the data is received on the slave side.
Just to clarify...

If you use point-to-point send and receive calls, you can make the block/nonblock decision independently on the send and receive sides.  E.g., use blocking send and nonblocking receive.  Or nonblocking send and blocking receive.  You get the idea.

Blocking on the send side only means blocking until the message has left the user's buffer on the send side.  It does not guarantee that the data has been received on the other end.

I agree with Bill that performance portability is an issue.  That is, the MPI standard itself doesn't really provide any guarantees here about what is fastest.  Perhaps polling this mailing list will be helpful, but if you are looking for "the fastest" solution regardless of which MPI implementation you use (and which interconnect you use... which might be determined at run time) you will probably be disappointed.

Using a collective call like MPI_Scatter (the send-side counterpart of MPI_Gather) may be worthwhile, but it doesn't deploy additional threads, and additional threads could indeed help in certain cases.

In addition to MPI implementation and which interconnect (or BTL) one uses, another important variable is message length.  Short messages may be sent "eagerly" while long messages may involve more synchronization between master and slaves.
Nonblocking send/recv wouldn't stop; instead, you must check the status on the slave side to see if the data has been sent.
Yes and no.  Again, data can be sent from the master but not yet received by the slave (if the MPI implementation buffers the data somewhere in-between).
Nonblocking is faster on the master side because the master doesn't need to wait for the slave to receive the data to continue.
???  For most sends, the master has to wait only on the data to leave the user send buffer.
So when you say you want your master to send "as fast as possible", I suppose you mean getting back to running your code as soon as possible.  In that case you would want nonblocking.  However, when you say you want the slaves to receive data faster, you seem to be referring to the actual data transmission across the network.  I believe the data transmission speed does not depend on whether the call is blocking or nonblocking.

On Sun, Jan 30, 2011 at 11:09 AM, Toon Knapen <toon.kna...@gmail.com> wrote:
Hi all,

If I have a master process that needs to send a chunk of (different) data to each of my N slave processes as fast as possible, would each slave receive its chunk faster if the master launched N threads, each doing a blocking send, or would it be better to launch N nonblocking sends in the master?

I'm currently using Open MPI on Ethernet, but might the best approach differ with other types of networks?

thanks in advance,
