The answer is "it depends"; there are a lot of factors involved:
- What is the topology of your network?
- Where do processes land within the topology of the network?
- What interconnect are you using? (e.g., the openib BTL will
usually use short-message RDMA to a limited set of peers as an
optimization)
- How long are your messages?
OMPI does not have any special optimizations for point-to-point
communications for MPI_COMM_WORLD ranks that happen to be powers of
two. Other factors (such as the ones above) may make those ranks look
faster in your runs, but there's nothing hard-coded in Open MPI for that.
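If it helps narrow things down, here is a rough sketch of how the measured numbers could be bucketed to check whether the effect really tracks the receiver's rank and the direction of the message. It is plain Python with no MPI in it; the function names (is_power_of_two, summarize_by_receiver, asymmetry) and the sample values are made up for illustration and are not real measurements or part of Open MPI.

```python
from statistics import mean


def is_power_of_two(rank: int) -> bool:
    """True for 1, 2, 4, 8, ... (rank 0 is not a power of two)."""
    return rank > 0 and (rank & (rank - 1)) == 0


def summarize_by_receiver(latencies):
    """Return (mean for power-of-two receivers, mean for all other receivers).

    `latencies` maps (sender, receiver) -> latency in microseconds.
    Both groups must be non-empty, otherwise statistics.mean raises.
    """
    pow2 = [t for (_, dst), t in latencies.items() if is_power_of_two(dst)]
    other = [t for (_, dst), t in latencies.items() if not is_power_of_two(dst)]
    return mean(pow2), mean(other)


def asymmetry(latencies, a, b):
    """Latency of a->b minus latency of b->a (negative: a->b is faster)."""
    return latencies[(a, b)] - latencies[(b, a)]


# Hypothetical numbers for illustration only, NOT real measurements.
sample = {
    (5, 32): 3.0,
    (32, 5): 4.0,
    (3, 8): 3.0,
    (8, 3): 4.0,
}

pow2_mean, other_mean = summarize_by_receiver(sample)
print(pow2_mean, other_mean, asymmetry(sample, 5, 32))
```

Running the same summary on measurements taken with a different process placement or a different interconnect would show whether the pattern follows the rank numbers themselves or something else in the setup.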
On Jun 5, 2007, at 1:10 PM, Andy Georgi wrote:
hi everybody,
i'm new on this list and started using OpenMPI for my parallel
jobs. first step was to measure the latency for blocking
communication functions. now my first question: is it possible that
ordained communication pairs will be optimized?
background:
latency for special processnumbers is nearly 25% smaller, e.g. for
process 1,2,4,8,16,32,64... (every computer scientist should see
the pattern ;-)). it doesn't matter from which process i send the
message if the receiver is one of these processes i have top
latency values. it's not possible that this effect comes through
the network because communication from proc5 to proc32 e.g. is
faster than communication from proc32 to proc5. i've tried it with
OpenMPI for Intel 1.1.4 and 1.2.2 and OpenMPI for PGI 1.2.2. always
the same results. now i think it must be a kind of optimization. if
it's so i would like to know it because then i have an
explanation ;-).
thx and regards,
andy
--
Jeff Squyres
Cisco Systems