Hi All,
My MPI program's basic task consists of regularly establishing point-to-point
communication with other processes via MPI_Alltoall, and then communicating the data.
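To make the pattern concrete, here is a minimal sketch of what I mean, assuming
MPI_Alltoall is used to exchange per-peer byte counts and the actual data then
goes over nonblocking point-to-point calls (function and buffer names below are
just placeholders, not my real code):

#include <mpi.h>
#include <stdlib.h>

/* Exchange variable-sized messages: MPI_Alltoall to learn how much each
 * peer will send, then MPI_Irecv/MPI_Isend for the data itself.
 * Error checking omitted for brevity. */
void exchange(int *sendcounts, char **sendbufs, MPI_Comm comm)
{
    int nprocs, nreq = 0;
    MPI_Comm_size(comm, &nprocs);

    /* Step 1: every rank learns how many bytes each peer will send it. */
    int *recvcounts = malloc(nprocs * sizeof(int));
    MPI_Alltoall(sendcounts, 1, MPI_INT, recvcounts, 1, MPI_INT, comm);

    /* Step 2: post the point-to-point transfers and wait for completion. */
    MPI_Request *reqs = malloc(2 * nprocs * sizeof(MPI_Request));
    char **recvbufs = malloc(nprocs * sizeof(char *));
    for (int p = 0; p < nprocs; ++p) {
        recvbufs[p] = malloc(recvcounts[p]);
        if (recvcounts[p] > 0)
            MPI_Irecv(recvbufs[p], recvcounts[p], MPI_BYTE, p, 0, comm,
                      &reqs[nreq++]);
        if (sendcounts[p] > 0)
            MPI_Isend(sendbufs[p], sendcounts[p], MPI_BYTE, p, 0, comm,
                      &reqs[nreq++]);
    }
    MPI_Waitall(nreq, reqs, MPI_STATUSES_IGNORE);

    /* ... use recvbufs here, then free everything ... */
    for (int p = 0; p < nprocs; ++p)
        free(recvbufs[p]);
    free(recvbufs);
    free(reqs);
    free(recvcounts);
}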
I tested it on two HPC clusters with 32-256 MPI tasks. On one of the systems
(HPC1) this custom collective runs flawlessly, while on
Hello,
Open MPI 1.4.3 on Mellanox InfiniBand hardware gives a latency of 250
microseconds with 256 MPI ranks on supercomputer A (name is colosse).
The same software gives a latency of 10 microseconds with MVAPICH2 and QLogic
InfiniBand hardware with 512 MPI ranks on supercomputer B (name is g