Dear OpenMPI developers,

I've run into a recurring problem that was addressed before on this
list, in a thread titled "Performance issue of mpirun/mpi_init".
I found the original thread here:
http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/21346

My former colleague noted that with OpenMPI version 1.8.1 he got bad
performance for a simple C program that only did MPI initialization.
This was apparently addressed in this ticket:
https://svn.open-mpi.org/trac/ompi/ticket/4510#comment:1
with my colleague noting that this solved the problem and version
1.8.1 r31402 did not have the problem any more.

Unfortunately I can't confirm this, as I'm still having performance
problems with 1.8.4, which (I assume) includes that fix from 1.8.1.

I decided to independently repeat the tests, so I've written the
following small Fortran test program "testme.f90":

program testme
  implicit none
  integer :: ierr
  call mpi_init(ierr)
  call mpi_finalize(ierr)
end program

I then proceeded with 1.6.5, 1.8.1, and 1.8.4 to create a binary:
/opt/openmpi-1.6.5/bin/mpif90 testme.f90 -o testme-165.exe
/opt/openmpi-1.8.1/bin/mpif90 testme.f90 -o testme-181.exe
/opt/openmpi-1.8.4/bin/mpif90 testme.f90 -o testme-184.exe

Timings were performed with the "time" program, running with 2
MPI processes on a single node.

time /opt/openmpi-1.6.5/bin/mpirun -np 2 testme-165.exe

real    0m1.022s
user    0m0.019s
sys     0m0.011s

As my former colleague noted, using "OMPI_MCA_btl=tcp,self" brings
down the time to that of other typical MPI implementations:

export OMPI_MCA_btl=tcp,self
time /opt/openmpi-1.6.5/bin/mpirun -np 2 testme-165.exe

real    0m0.020s
user    0m0.014s
sys     0m0.014s
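For reference, the same MCA parameter can also be passed on the mpirun
command line instead of through the environment, which scopes it to a
single launch (equivalent to the export above):

```shell
# Same effect as "export OMPI_MCA_btl=tcp,self", but for one launch only:
time /opt/openmpi-1.6.5/bin/mpirun --mca btl tcp,self -np 2 testme-165.exe
```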

Now, moving to 1.8.1, the out-of-the-box timings are better, but they
are unaffected by the OMPI_MCA_btl setting:

time /opt/openmpi-1.8.1/bin/mpirun -np 2 testme-181.exe

real    0m0.620s
user    0m0.267s
sys     0m0.253s

When using version 1.8.4, the timings _are_ indeed better compared to
1.8.1 (but also not affected by the OMPI_MCA_btl setting):

time /opt/openmpi-1.8.4/bin/mpirun -np 2 testme-184.exe

real    0m0.376s
user    0m0.170s
sys     0m0.179s
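A single "time" measurement of such a short run is of course noisy, so
one can also average over repeated launches with a small helper like the
following sketch (the function name and the launch count are my own; it
assumes GNU date with %N nanosecond support, and the mpirun line in the
comment is just an example):

```shell
# Average wall-clock time, in milliseconds, over N launches of a command.
# Assumes GNU date (%N yields nanoseconds); output of the timed command
# is discarded so it does not interfere with the measurement.
avg_time_ms() {
  n=$1; shift
  total=0
  i=0
  while [ "$i" -lt "$n" ]; do
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    total=$(( total + (end - start) ))
    i=$(( i + 1 ))
  done
  echo $(( total / n / 1000000 ))
}

# Example usage (hypothetical path, average over 10 launches):
# avg_time_ms 10 /opt/openmpi-1.8.4/bin/mpirun -np 2 testme-184.exe
avg_time_ms 3 true
```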

However, even though 1.8.4 improves on 1.8.1, its performance is still
not close to that of 1.6.5 (with the OMPI_MCA_btl setting), nor to that
of other MPI implementations.

The reason we care about this is that our test suite runs a lot of
short tests consisting of independent executables that are launched
thousands of times, each launch calling mpi_init. This increases the
total running time of the entire test suite from around 2-3 hours
(with MPICH, or with OpenMPI 1.6.5 plus OMPI_MCA_btl=tcp,self) to
around 9 hours with OpenMPI 1.8.4.
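To put a rough number on that: the startup overhead alone can account
for most of the difference. A back-of-envelope calculation, where the
launch count of 60000 is an assumed round number (not our exact figure)
while the per-launch times are the ones measured above:

```shell
# Back-of-envelope: extra wall time attributable to slower startup alone.
# 60000 launches is an assumed round number; 376 ms (1.8.4) and 20 ms
# (1.6.5 with OMPI_MCA_btl=tcp,self) are the measured per-launch times.
launches=60000
overhead_ms=$(( 376 - 20 ))
extra_s=$(( launches * overhead_ms / 1000 ))
echo "$(( extra_s / 3600 ))h $(( extra_s % 3600 / 60 ))m of extra startup time"
# prints: 5h 56m of extra startup time
```

which is consistent with the roughly six extra hours we observe.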

kind regards,
Steven

-- 
Steven Vancoillie
Theoretical Chemistry
Lund University
P.O.B 124
S-221 00 Lund
Sweden
