Hi, everyone,

I ran some bandwidth tests on two different systems with Mellanox InfiniBand (FDR and EDR). I compiled the three supported versions of Open MPI (1.10.6, 2.0.2, 2.1.0) and measured the time it takes to send/receive 4 MB arrays of doubles between two hosts connected to the same IB switch. MPI_Send/MPI_Recv were performed 1000 times, and the table below gives the average bandwidth obtained [MB/s]:

OpenMPI      FDR          EDR
1.10.6       6203.0       11271.1
2.0.2        5128.4       11948.0
2.1.0        5095.1       11947.2
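
For reference, the measurement loop was essentially the following. This is a minimal sketch rather than the literal code: the unidirectional pattern (rank 0 sends, rank 1 receives) and the closing barrier are my simplifications.

    /* bw_test.c: rough sketch of the 4 MB send/recv bandwidth test */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        const int    iters = 1000;
        const size_t n     = 4 * 1024 * 1024 / sizeof(double); /* 4 MB payload */
        double *buf = malloc(n * sizeof(double));
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        memset(buf, 0, n * sizeof(double));

        MPI_Barrier(MPI_COMM_WORLD);            /* start both ranks together */
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0)
                MPI_Send(buf, (int)n, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            else if (rank == 1)
                MPI_Recv(buf, (int)n, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        }
        MPI_Barrier(MPI_COMM_WORLD);            /* wait until all data has landed */
        double t1 = MPI_Wtime();

        if (rank == 0) {
            double mbytes = (double)iters * n * sizeof(double) / 1.0e6;
            printf("average bandwidth: %.1f MB/s\n", mbytes / (t1 - t0));
        }

        free(buf);
        MPI_Finalize();
        return 0;
    }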

The openib btl was used to transfer the data. The results are puzzling: something seems to have changed starting with the 2.x series. On the FDR system, 2.x performs much worse than the earlier 1.10.x release (roughly an 18% drop), while on the EDR system I see the opposite (2.x is faster), though there the difference is only about 6%.
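
In case it is relevant: a run pinned to the openib btl looks like the following (the host names and binary name are placeholders):

    mpirun --mca btl openib,self -np 2 --host node1,node2 ./bw_test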

Has anyone experienced similar behavior? Is this due to Open MPI, or something else? The two systems run CentOS (FDR: 6.8, EDR: 7.3) and Mellanox OFED with a minor version difference.

I'd appreciate any thoughts.

Thanks a lot!

Marcin Krotkiewski

