Did you build from the svn repo, or from a tarball? I ask because you don't 
need to run ./autogen.sh (and usually shouldn't) when building from a tarball.

The reason this matters: our configure code checks whether the code came 
from svn. If it did, configure assumes a developer is doing the build, so debug 
is automatically enabled - which significantly reduces performance.
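
If you want to verify what an existing install actually ended up with, one quick 
check is to look at the ompi_info output (a sketch - the exact label can vary by 
version):

  ompi_info | grep -i debug

which should show whether internal debug support was compiled in.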

If it came from a tarball, then we build without debug by default - but we 
still do not optimize. Other MPIs typically build optimized, so their 
out-of-the-box performance is better. I would have expected that to show up in 
the benchmark as well, but it can be rather hit-and-miss, as benchmarks are 
very poor predictors of actual application performance.

If you really want to test performance, you should always configure with 
--disable-debug CFLAGS=-O3 (or pick your favorite optimization level for your 
selected compiler - results for a given optimization level are very 
compiler-specific).
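
For example, adapting the configure line you posted below (the prefix and knem 
path are yours; treat this as a sketch and adjust for your setup):

  ./configure --prefix=/mpi/openmpi-1.5.4 --with-openib --with-knem=/opt/knem \
      --disable-debug CC=icc CXX=icpc F77=ifort FC=ifort \
      CFLAGS=-O3 CXXFLAGS=-O3 FFLAGS=-O3 FCFLAGS=-O3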

Some applications are also sensitive to the relative placement of ranks. OMPI's 
mapping pattern can differ significantly from that of other MPIs, so you might 
also want to check which ranks went where. For OMPI, you can see the mapping by 
adding --display-map to the mpirun command line.
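
For example (the process count and executable name here are placeholders):

  mpirun -np 16 --bysocket --bind-to-socket --display-map ./your_app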

Beyond that, without seeing your mpirun command line versus what you did for 
the other MPIs, all we can do is whistle in the dark  :-)


On Dec 27, 2011, at 5:47 PM, Eric Feng wrote:

> I used "--bind-to-socket --bysocket" all the time; it helps performance. I 
> never oversubscribed a node.
> Each node has Intel Westmere CPUs, and all cores are used by the application.
> 
> Open MPI version is 1.5.4.
> 
> The way I installed Open MPI:
> ./autogen.sh
> ./configure --prefix=/mpi/openmpi-1.5.4 --with-openib CC=icc CXX=icpc 
> F77=ifort FC=ifort --with-knem=/opt/knem
> 
> 
> 
> From: Eugene Loh <eugene....@oracle.com>
> To: Open MPI Users <us...@open-mpi.org> 
> Cc: Eric Feng <hpc_benchm...@yahoo.com> 
> Sent: Wednesday, December 28, 2011 1:58 AM
> Subject: Re: [OMPI users] Openmpi performance issue
> 
> If I remember correctly, both Intel MPI and MVAPICH2 bind processes by 
> default.  OMPI does not.  There are many cases where the "bind by default" 
> behavior gives better default performance.  (There are also cases where it 
> can give catastrophically worse performance.)  Anyhow, it seems possible to 
> me that this accounts for the difference you're seeing.
> 
> To play with binding in OMPI, you can try adding "--bind-to-socket 
> --bysocket" to your mpirun command line, though what to try can depend on 
> what version of OMPI you're using as well as details of your processor 
> (HyperThreads?), your application, etc.  There's a FAQ entry at 
> http://www.open-mpi.org/faq/?category=tuning#using-paffinity
> 
> On 12/27/2011 6:45 AM, Ralph Castain wrote:
>> 
>> It depends a lot on the application and how you ran it. Can you provide some 
>> info? For example, if you oversubscribed the node, then we dial down the 
>> performance to provide better cpu sharing. Another point: we don't bind 
>> processes by default while other MPIs do. Etc.
>> 
>> So more info (like the mpirun command line you used, which version you used, 
>> how OMPI was configured, etc.) would help.
>> 
>> 
>> On Dec 27, 2011, at 6:35 AM, Eric Feng wrote:
>> 
>>> Can anyone help me?
>>> I see a similar performance issue when comparing to MVAPICH2, which is much 
>>> faster in every MPI function in the real application but similar in the IMB 
>>> benchmark.
>>> 
>>> From: Eric Feng <hpc_benchm...@yahoo.com>
>>> To: "us...@open-mpi.org" <us...@open-mpi.org> 
>>> Sent: Friday, December 23, 2011 9:12 PM
>>> Subject: [OMPI users] Openmpi performance issue
>>> 
>>> Hello, 
>>> 
>>> I am running into a performance issue with Open MPI and hope the experts 
>>> here can provide some help.
>>> 
>>> I have an application that calls a lot of sendrecv and isend/irecv followed 
>>> by waitall. When I run it with Intel MPI, it is around 30% faster than with 
>>> Open MPI. However, if I test sendrecv using IMB, Open MPI is even faster 
>>> than Intel MPI; but when running the real application, the profiling 
>>> results show Open MPI is much slower than Intel MPI in all MPI functions. 
>>> So this is not an issue with a single function; there is an overall 
>>> drawback somewhere. Can anyone suggest where to tune so it runs faster with 
>>> the real application?
> 
> 