The answer is "it depends"; there are a lot of factors involved.
- What is the topology of your network?
- Where do processes land within the topology of the network?
- What interconnect are you using? (e.g., the openib BTL will
usually use short message RDMA to a limited set of peers as an
op
Hi everybody,
I'm new to this list and have started using OpenMPI for my parallel jobs. My first
step was to measure the latency of the blocking communication functions. Now my first
question: is it possible that certain communication pairs are optimized?
Background:
latency for special processnumbe
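
One way to see why particular rank pairs might show lower latency is to look at where the ranks land. The sketch below is only an illustration, not Open MPI code: the helper names (`node_by_slot`, `node_by_node`, `same_node_pairs`) and the 2-node, 2-slots-per-node cluster are assumptions made up for the example. It lists which rank pairs share a node under two common placement policies (fill one node first, or round-robin across nodes). Pairs on the same node can typically talk over shared memory, while the others have to cross the interconnect.

```python
# Hypothetical helpers for illustration only; none of this is Open MPI API.

def node_by_slot(rank: int, slots_per_node: int) -> int:
    """Fill every slot on a node before moving on to the next node."""
    return rank // slots_per_node


def node_by_node(rank: int, num_nodes: int) -> int:
    """Deal ranks out round-robin across the nodes."""
    return rank % num_nodes


def same_node_pairs(num_ranks, node_of):
    """Return all rank pairs (a, b) with a < b that land on the same node."""
    return [
        (a, b)
        for a in range(num_ranks)
        for b in range(a + 1, num_ranks)
        if node_of(a) == node_of(b)
    ]


if __name__ == "__main__":
    # Assumed toy cluster: 2 nodes with 2 slots each, 4 ranks in total.
    num_nodes, slots_per_node, num_ranks = 2, 2, 4

    by_slot = same_node_pairs(num_ranks, lambda r: node_by_slot(r, slots_per_node))
    by_node = same_node_pairs(num_ranks, lambda r: node_by_node(r, num_nodes))

    print("same-node pairs, fill-by-slot:", by_slot)
    print("same-node pairs, round-robin: ", by_node)
```

With the same four ranks, the set of "close" pairs changes completely between the two policies, so a latency pattern that seems tied to specific process numbers may simply reflect the mapping in use.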