On Feb 29, 2008, at 6:25 PM, Elvedin Trnjanin wrote:
I'm using a "ping pong" program to approximate the bandwidth and latency
of various message sizes, and I notice when doing various transfers
(e.g., async) that the maximum bandwidth isn't the system's maximum
bandwidth. I've looked through the FAQ and haven't noticed this being
covered, but how does OpenMPI handle loopback communication? Is it
still over a network interconnect or some sort of shared memory copy?
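
(For context, a minimal sketch of that kind of ping-pong loop is below.
It is purely illustrative; the message size, iteration count, and output
format are assumptions, not the actual program from the question.)

/* Minimal MPI ping-pong sketch: rank 0 sends to rank 1 and waits for
 * the message to come back, timing the round trip. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int size  = 1 << 20;   /* message size in bytes (assumed) */
    const int iters = 100;       /* number of round trips (assumed) */
    char *buf = malloc(size);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; ++i) {
        if (rank == 0) {
            MPI_Send(buf, size, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, size, MPI_BYTE, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, size, MPI_BYTE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, size, MPI_BYTE, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        double rtt = (t1 - t0) / iters;        /* seconds per round trip */
        double bw  = 2.0 * size / rtt / 1e6;   /* MB/s, both directions */
        printf("size=%d bytes  latency=%g us  bandwidth=%g MB/s\n",
               size, rtt / 2.0 * 1e6, bw);
    }

    free(buf);
    MPI_Finalize();
    return 0;
}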
There are two kinds of loopback:
1. messages exchanged between two MPI processes on the same host.
This can be handled by most of OMPI's devices, but the best/fastest is
usually shared memory (i.e., the "sm" BTL).
2. messages exchanged within a single MPI process (i.e., a process
sending to itself). This is handled by the "self" OMPI device because
it's just a memcpy within a single process.
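
For case 2, a small illustration (not code from this thread) is a rank
doing an MPI_Sendrecv with itself; with the "self" BTL that exchange
never touches a network interface:

/* Illustrative self-send: a single rank sends to and receives from
 * itself.  With Open MPI's "self" BTL this is handled as a local copy
 * inside the process. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, out = 42, in = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* dest == source == my own rank; MPI_Sendrecv avoids the deadlock
       a plain blocking MPI_Send to oneself could cause for large
       messages */
    MPI_Sendrecv(&out, 1, MPI_INT, rank, 0,
                 &in,  1, MPI_INT, rank, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    printf("rank %d received %d from itself\n", rank, in);
    MPI_Finalize();
    return 0;
}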
So you'd typically want to run (assuming you have an IB network):
mpirun --mca btl openib,self,sm ....
That being said, OMPI should usually pick the relevant BTL modules for
you (including self and sm).
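
If you want to measure the on-node path in isolation, you can also
restrict a 2-process run on one host to shared memory, e.g.:

mpirun --mca btl sm,self -np 2 ./pingpong

(./pingpong is just a placeholder name for whatever the benchmark
binary is called.)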
--
Jeff Squyres
Cisco Systems