On Feb 29, 2008, at 6:25 PM, Elvedin Trnjanin wrote:

I'm using a "ping pong" program to approximate bandwidth and latency of
various message sizes and I notice when doing various transfers (e.g.,
async) that the maximum bandwidth isn't the system's maximum bandwidth.
I've looked through the FAQ and I haven't noticed this being covered but
how does OpenMPI handle loopback communication? Is it still over a
network interconnect or some sort of shared memory copy?


There are two kinds of loopback:

1. messages exchanged between two MPI processes on the same host. This can be handled by most of OMPI's devices, but the best/fastest is usually shared memory (i.e., the "sm" BTL).

2. messages exchanged within a single MPI process (i.e., a process sending to itself). This is handled by the "self" OMPI device because it's just a memcpy within a single process.

So you'd typically want to run (assuming you have an IB network):

    mpirun --mca btl openib,self,sm ....

That being said, OMPI should usually pick the relevant BTL modules for you (including self and sm).
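
If you want to compare what different BTL selections give you, you can time the same ping-pong under each one and convert the timings into latency and bandwidth. Here is a rough sketch of that arithmetic in Python. The function name and the sample numbers are made up for illustration (they are not part of Open MPI), and it assumes each round trip is one send and one reply of the same size:

```python
def pingpong_metrics(message_bytes, round_trips, elapsed_seconds):
    """Estimate one-way latency and bandwidth from a ping-pong timing.

    Hypothetical helper: assumes every round trip moves `message_bytes`
    in each direction and that `elapsed_seconds` covers all round trips.
    """
    if message_bytes < 0:
        raise ValueError("message_bytes must be non-negative")
    if round_trips <= 0 or elapsed_seconds <= 0:
        raise ValueError("round_trips and elapsed_seconds must be positive")

    # One round trip is two one-way transfers, so halve the per-trip time.
    one_way_seconds = elapsed_seconds / (2 * round_trips)

    # Bytes moved in one direction divided by the time one direction took.
    bandwidth_bytes_per_s = message_bytes / one_way_seconds
    return one_way_seconds, bandwidth_bytes_per_s


if __name__ == "__main__":
    # Made-up sample: 1 MiB messages, 1000 round trips, 2 seconds total.
    latency, bandwidth = pingpong_metrics(1 << 20, 1000, 2.0)
    print("one-way time: %.6f s" % latency)
    print("bandwidth:    %.1f MiB/s" % (bandwidth / (1 << 20)))
```

For small messages the one-way time is what you'd quote as latency; for large messages the bandwidth figure is the interesting one. Running the same sizes with and without sm in the BTL list should make it fairly obvious which path the messages are taking.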

--
Jeff Squyres
Cisco Systems
