The answer is "it depends"; there are a lot of factors involved.

- What is the topology of your network?
- Where do processes land within the topology of the network?
- What interconnect are you using? (For example, the openib BTL will usually use short-message RDMA to a limited set of peers as an optimization.)
- How long are your messages?

Open MPI does not have any special optimizations for point-to-point communications for MPI_COMM_WORLD ranks that happen to be powers of two. Other factors may combine to produce that pattern in your runs, but there's nothing hard-coded in Open MPI for it.
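
To separate "receiver rank is special" from "this pair happens to sit close together in the network", it can help to tabulate the measured latencies per (sender, receiver) pair and compare both directions. Here is a minimal sketch in plain Python; the function names and the sample numbers are made up for illustration and are not part of Open MPI or of the original measurements:

```python
from statistics import mean


def is_power_of_two(rank):
    """True for 1, 2, 4, 8, ... (rank 0 does not count as a power of two)."""
    return rank > 0 and (rank & (rank - 1)) == 0


def summarize(latencies):
    """Summarize a dict mapping (sender, receiver) -> latency in microseconds.

    Groups the mean latency by whether the receiver rank is a power of two,
    and reports forward-minus-reverse latency for pairs measured both ways.
    """
    pow2 = [t for (s, r), t in latencies.items() if is_power_of_two(r)]
    other = [t for (s, r), t in latencies.items() if not is_power_of_two(r)]
    asymmetry = {
        (s, r): t - latencies[(r, s)]
        for (s, r), t in latencies.items()
        if s < r and (r, s) in latencies
    }
    return {
        "mean_pow2_receiver": mean(pow2) if pow2 else None,
        "mean_other_receiver": mean(other) if other else None,
        "asymmetry": asymmetry,
    }


# Hypothetical numbers, purely illustrative (not real measurements).
sample = {
    (5, 32): 3.0,
    (32, 5): 4.0,
    (5, 6): 4.0,
    (6, 5): 4.0,
}
result = summarize(sample)
print(result)
```

If the gap follows the receiver rank no matter who sends, it is worth checking where those ranks land in the topology and how long the messages are before concluding that anything rank-specific is going on.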



On Jun 5, 2007, at 1:10 PM, Andy Georgi wrote:

hi everybody,

i'm new on this list and have started using OpenMPI for my parallel jobs. the first step was to measure the latency of the blocking communication functions. now my first question: is it possible that certain communication pairs are optimized?

background:

latency for special process numbers is nearly 25% smaller, e.g. for processes 1, 2, 4, 8, 16, 32, 64, ... (every computer scientist should see the pattern ;-)). it doesn't matter which process i send the message from: if the receiver is one of these processes, i get the best latency values. this effect can't come from the network, because communication from proc5 to proc32, for example, is faster than communication from proc32 to proc5. i've tried it with OpenMPI for Intel 1.1.4 and 1.2.2 and OpenMPI for PGI 1.2.2, always with the same results. now i think it must be some kind of optimization. if so, i would like to know, because then i have an explanation ;-).

thx and regards,

andy


--
Jeff Squyres
Cisco Systems
