On Feb 29, 2012, at 2:57 PM, Jingcha Joba wrote:

> So if I understand correctly: if a message is smaller than some threshold, it 
> will use the MPI way (non-RDMA, two-sided communication), and if it's larger, 
> then it would use OpenFabrics, by using ibverbs (and the OFED stack) instead 
> of MPI's stack?

Er... no.

So let's talk MPI-over-OpenFabrics-verbs specifically.

All MPI communication calls will use verbs under the covers.  They may use 
verbs send/receive semantics in some cases, and RDMA semantics in other cases.  
"It depends" -- on a lot of things, actually.  It's hard to come up with a good 
rule of thumb for when it uses one or the other; this is one of the reasons 
that the openib BTL code is so complex.  :-)
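
To give a flavor of the kind of decision involved, here's a purely 
illustrative C sketch -- this is NOT the openib BTL's actual logic, and the 
threshold and function names are made up.  Many implementations use 
send/receive semantics for small "eager" messages and switch to RDMA for 
large ones:

    /* Illustrative sketch only -- NOT Open MPI's actual openib BTL code.
     * It just models the common "eager vs. rendezvous" decision that many
     * MPI implementations make under the covers.  The threshold and both
     * paths here are hypothetical. */
    #include <stdio.h>
    #include <stddef.h>

    #define EAGER_LIMIT (12 * 1024)   /* hypothetical crossover point */

    /* Stand-in for verbs send/receive semantics: payload is copied into
     * a pre-registered bounce buffer and sent. */
    static void send_eager(const void *buf, size_t len) {
        (void)buf;
        printf("eager: copy + verbs send/recv (%zu bytes)\n", len);
    }

    /* Stand-in for RDMA semantics: register the user buffer, exchange
     * rkeys, then RDMA-write directly into the peer's memory. */
    static void send_rendezvous(const void *buf, size_t len) {
        (void)buf;
        printf("rendezvous: register + verbs RDMA write (%zu bytes)\n", len);
    }

    /* The user's MPI call looks identical either way; only the
     * transport-level path differs. */
    static void transport_send(const void *buf, size_t len) {
        if (len <= EAGER_LIMIT)
            send_eager(buf, len);
        else
            send_rendezvous(buf, len);
    }

    int main(void) {
        static char small[1024], big[1 << 20];
        transport_send(small, sizeof(small));
        transport_send(big, sizeof(big));
        return 0;
    }

And in reality it's not just message size -- buffer registration state, 
resource availability, and a pile of tuning parameters all factor in.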

The main points here are:

1. You can trust the openib BTL to do the best thing possible to get the 
message to the other side, regardless of whether that message came from an 
MPI_SEND or an MPI_PUT (for example).

2. MPI_PUT does not necessarily == verbs RDMA write (and likewise, MPI_GET does 
not necessarily == verbs RDMA read).

> If so, could that be the reason why MPI_Put "hangs" when sending a message 
> larger than 512KB (or maybe 1MB)?

No.  I'm guessing that there's some kind of bug in the MPI_PUT implementation.
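
If you want to narrow it down, a minimal self-contained reproducer along 
these lines would help isolate the problem (standard MPI one-sided calls; 
the 1 MB size is just a guess at your hang point -- adjust it):

    /* Minimal MPI_Put test.  Run with at least 2 ranks:
     *   mpirun -np 2 ./put_test
     * Adjust SIZE to the message size where you see the hang. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define SIZE (1024 * 1024)   /* 1 MB -- hypothetical hang point */

    int main(int argc, char **argv) {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
        if (nprocs < 2) {
            fprintf(stderr, "need at least 2 ranks\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        char *buf = malloc(SIZE);
        MPI_Win win;
        MPI_Win_create(buf, SIZE, 1, MPI_INFO_NULL, MPI_COMM_WORLD, &win);

        MPI_Win_fence(0, win);
        if (rank == 0)
            MPI_Put(buf, SIZE, MPI_CHAR, 1, 0, SIZE, MPI_CHAR, win);
        MPI_Win_fence(0, win);

        if (rank == 0)
            printf("MPI_Put of %d bytes completed\n", SIZE);

        MPI_Win_free(&win);
        free(buf);
        MPI_Finalize();
        return 0;
    }

If that hangs at the second fence for large SIZE but not small, that's good 
evidence for a bug report.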

> Also, is there a way to know whether, for a particular MPI call, OF uses a 
> send/recv or an RDMA exchange?

Not really.

More specifically: all things being equal, you don't care which is used.  You 
just want your message to get to the receiver/target as fast as possible.  One 
of the main ideas of MPI is to hide those kinds of details from the user.  
I.e., you call MPI_SEND.  A miracle occurs.  The message is received on the 
other side.
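
In code, the entire user-visible story is just standard MPI like this; 
everything transport-related happens below it:

    /* The user-visible story: no verbs, no RDMA, no protocol choice. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv) {
        int rank, msg = 42;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
            MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("a miracle occurred: got %d\n", msg);
        }
        MPI_Finalize();
        return 0;
    }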

:-)

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

