Hi again,
I managed to reproduce the "bug" with a simple case (see the cpp file
attached).
I am running this on 2 nodes with 8 cores each. If I run with
mpiexec ./test-mpi-latency.out
then the MPI_Ssend operations take ~1e-5 seconds for intra-node
ranks, and ~11 seconds for inter-node ranks.
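
For anyone who cannot open the attachment, a minimal test of this kind could
look something like the sketch below. This is a simplified illustration, not
the attached file; the busy loop, message size and iteration count are
arbitrary. Rank 0 times an MPI_Ssend to every other rank, and each receiver
goes straight from its MPI_Recv into an MPI-free computation:

#include <mpi.h>
#include <cstdio>
#include <cmath>

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int payload = 42;

    if (rank == 0) {
        // Time a synchronous send to every other rank.
        for (int dest = 1; dest < size; ++dest) {
            double t0 = MPI_Wtime();
            MPI_Ssend(&payload, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
            double t1 = MPI_Wtime();
            std::printf("MPI_Ssend to rank %d completed in %g s\n", dest, t1 - t0);
        }
    } else {
        MPI_Recv(&payload, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        // Long single-rank computation with no MPI calls right after the
        // receive; this is where the inter-node Ssend completion seems to stall.
        volatile double x = 0.0;
        for (long i = 0; i < 200000000L; ++i) x += std::sqrt((double)i);
    }

    MPI_Finalize();
    return 0;
}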
Hi again,
I found out that if I add an
MPI_Barrier after the MPI_Recv part, then there is no minute-long latency.
Is it possible that even if MPI_Recv returns, the openib btl does not
guarantee that the acknowledgement is sent promptly? In other words, is
it possible that the computation following the MPI_Recv delays the sending
of that acknowledgement?
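
Concretely, the change I am describing has this shape on the receiving ranks
(a simplified sketch, not the actual code; receive_then_compute and the busy
loop are placeholders, and the sending rank has to call the same barrier after
its MPI_Ssend):

#include <mpi.h>
#include <cmath>

void receive_then_compute(MPI_Comm comm)
{
    int payload = 0;
    MPI_Recv(&payload, 1, MPI_INT, 0, 0, comm, MPI_STATUS_IGNORE);

    // The extra barrier: one more trip into the MPI library before the
    // long MPI-free computation. With it the sender's MPI_Ssend completes
    // right away; without it, it does not.
    MPI_Barrier(comm);

    volatile double x = 0.0;                     // placeholder computation
    for (long i = 0; i < 200000000L; ++i)
        x += std::sqrt(static_cast<double>(i));
}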
Hi,
I have a strange case here. The application is "plink"
(http://pngu.mgh.harvard.edu/~purcell/plink/download.shtml). The
computation/communication pattern of the application is the following
(a rough code skeleton is given after the list):
1- MPI_Init
2- Some single rank computation
3- MPI_Bcast
4- Some single rank computation
5- MPI_Barrier
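
In code form, steps 1 to 5 correspond roughly to the skeleton below (the
broadcast payload and local_computation are made-up placeholders; the real
plink code is of course more involved):

#include <mpi.h>

static void local_computation() { /* placeholder for single-rank work */ }

int main(int argc, char** argv)
{
    MPI_Init(&argc, &argv);                            // 1- MPI_Init
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    local_computation();                               // 2- single rank computation

    int params[4] = {0, 0, 0, 0};                      // hypothetical payload
    MPI_Bcast(params, 4, MPI_INT, 0, MPI_COMM_WORLD);  // 3- MPI_Bcast

    local_computation();                               // 4- single rank computation

    MPI_Barrier(MPI_COMM_WORLD);                       // 5- MPI_Barrier

    MPI_Finalize();
    return 0;
}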