Here is what I get on my CentOS 7 system with the about-to-be-released 1.8.5:

When built as a debug build:

07:41:34  (v1.8) /home/common/openmpi/ompi-release/orte/test/mpi$ time mpirun -host bend001 -n 2 ./mpi_no_op

real    0m0.120s
user    0m0.064s
sys     0m0.090s
07:42:05  (v1.8) /home/common/openmpi/ompi-release/orte/test/mpi$ time mpirun -host bend001 -n 2 -mca btl tcp,self ./mpi_no_op

real    0m0.114s
user    0m0.065s
sys     0m0.079s


When built with -O2:

07:52:35  (v1.8) /home/common/openmpi/ompi-release/orte/test/mpi$ time mpirun -host bend001 -n 2 ./mpi_no_op

real    0m0.113s
user    0m0.050s
sys     0m0.095s
07:52:40  (v1.8) /home/common/openmpi/ompi-release/orte/test/mpi$ time mpirun -host bend001 -n 2 --mca btl tcp,self ./mpi_no_op

real    0m0.110s
user    0m0.054s
sys     0m0.086s

Note that I ran this on only one node, as in your report. However, one difference 
between 1.6 and 1.8 is that the latter starts daemons on every node in its 
allocation prior to launching the job, while the former only started daemons on 
the nodes it was going to use. We do that so we can sense the hardware topology 
of each node and thus provide a wider range of mapping options; in production, 
it is unusual for someone to request more nodes than they intend to use.

If you have a hostfile (or allocation of some kind) that is larger than one 
node, then the extra time is likely being used by this DVM-style launch. You 
might try adding the -host flag (as I did above) to cut that down.
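
For example, a minimal sketch (the hostfile name "myhosts" and the node names 
are hypothetical, just for illustration):

# hostfile describing the full allocation; 1.8 starts a daemon on every node listed here
$ cat myhosts
node01
node02
node03
node04

# launch against the whole hostfile (pays the daemon-launch cost on all four nodes):
$ time mpirun --hostfile myhosts -n 2 ./testme-184.exe

# restrict the run to the node you actually intend to use:
$ time mpirun -host node01 -n 2 ./testme-184.exe

If the second form is noticeably faster, the extra time is indeed going into 
starting daemons on the unused nodes.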


> On Apr 28, 2015, at 2:35 AM, Steven Vancoillie 
> <steven.vancoil...@teokem.lu.se> wrote:
> 
> Dear OpenMPI developers,
> 
> I've run into a recurring problem that was addressed on this list
> before, under the subject "Performance issue of mpirun/mpi_init".
> The original thread is here:
> http://comments.gmane.org/gmane.comp.clustering.open-mpi.user/21346
> 
> My former colleague noted that with OpenMPI version 1.8.1 he got bad
> performance for a simple C program that only did MPI initialization.
> This was apparently addressed in this ticket:
> https://svn.open-mpi.org/trac/ompi/ticket/4510#comment:1
> and my colleague confirmed that version 1.8.1 r31402 no longer had
> the problem.
> 
> Unfortunately I can't confirm this, as I'm still having performance
> problems with 1.8.4, which (I assume) includes that fix from 1.8.1.
> 
> I decided to independently repeat the tests, so I've written the
> following small Fortran test program "testme.f90":
> 
> program testme
> call mpi_init(ierr)
> call mpi_finalize(ierr)
> end program
> 
> I then proceeded with 1.6.5, 1.8.1, and 1.8.4 to create a binary:
> /opt/openmpi-1.6.5/bin/mpif90 testme.f90 -o testme-165.exe
> /opt/openmpi-1.8.1/bin/mpif90 testme.f90 -o testme-181.exe
> /opt/openmpi-1.8.4/bin/mpif90 testme.f90 -o testme-184.exe
> 
> Timings were performed with the "time" program, running with 2
> MPI processes on a single node.
> 
> time /opt/openmpi-1.6.5/bin/mpirun -np 2 testme-165.exe
> 
> real    0m1.022s
> user    0m0.019s
> sys     0m0.011s
> 
> As my former colleague noted, using "OMPI_MCA_btl=tcp,self" brings
> down the time to that of other typical MPI implementations:
> 
> export OMPI_MCA_btl=tcp,self
> time /opt/openmpi-1.6.5/bin/mpirun -np 2 testme-165.exe
> 
> real    0m0.020s
> user    0m0.014s
> sys     0m0.014s
> 
> Now, when going to 1.8.1, the timings are better initially, but
> unaffected by the OMPI_MCA_btl setting:
> 
> time /opt/openmpi-1.8.1/bin/mpirun -np 2 testme-181.exe
> 
> real    0m0.620s
> user    0m0.267s
> sys     0m0.253s
> 
> When using version 1.8.4, the timings _are_ indeed better compared to
> 1.8.1 (but also not affected by the OMPI_MCA_btl setting):
> 
> time /opt/openmpi-1.8.4/bin/mpirun -np 2 testme-184.exe
> 
> real    0m0.376s
> user    0m0.170s
> sys     0m0.179s
> 
> However, even though there is an improvement over 1.8.1, the
> performance of 1.8.4 is not even close to that of 1.6.5 (with the
> OMPI_MCA_btl setting) or that of other MPI implementations.
> 
> The reason we care about this is that our test suite runs a lot of
> short tests consisting of independent executables that are run
> thousands of times, each one calling mpi_init. This increases the
> total time of running the entire test suite from around 2-3 hours
> (with MPICH, or OpenMPI 1.6.5 with OMPI_MCA_btl=tcp,self) to around
> 9 hours with OpenMPI 1.8.4.
> 
> kind regards,
> Steven
> 
> -- 
> Steven Vancoillie
> Theoretical Chemistry
> Lund University
> P.O.B 124
> S-221 00 Lund
> Sweden
