Terry Dontje wrote:
Craig,
Did your affinity script bind the processes per socket, or linearly to
cores? If the former, you'll want to look at using rankfiles and placing
the ranks based on sockets (a sketch follows below). We have found this
especially useful if you are not running fully subscribed on your
machines.
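As a minimal sketch, assuming two-socket nodes named n0 and n1 (the
hostnames and layout are placeholders, not from this thread), a rankfile
that alternates ranks across sockets could look like this, where the
slot syntax is socket:core:

    rank 0=n0 slot=0:0
    rank 1=n0 slot=1:0
    rank 2=n1 slot=0:0
    rank 3=n1 slot=1:0

It would be passed with something like "mpirun -np 4 -rf my_rankfile
./app" (./app being a placeholder for the real executable).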
The script binds them to sockets and also binds memory per node.
It is smart enough that if the machine file does not use all
the cores (because the user reordered them), it will still lay
the tasks out evenly across the two sockets.
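For reference, a minimal sketch of that kind of wrapper (not the actual
script; it assumes two-socket nodes, numactl installed, and that Open
MPI exports OMPI_COMM_WORLD_LOCAL_RANK to each process):

    #!/bin/sh
    # Map each local rank to one socket (NUMA node) and bind its
    # memory there; local ranks alternate between the two sockets.
    SOCKET=$(( ${OMPI_COMM_WORLD_LOCAL_RANK:-0} % 2 ))
    exec numactl --cpunodebind=$SOCKET --membind=$SOCKET "$@"

Ranks would then be launched through the wrapper, e.g. "mpirun -np 16
./bind.sh ./app" (bind.sh and ./app are placeholder names).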
Also, if you think the main issue is collectives performance, you may
want to try the hierarchical and shared-memory (SM) collectives; an
example invocation follows below. Be forewarned, though, that we are
right now trying to pound out some errors in these modules. To enable
them, add the following parameters: "--mca coll_hierarch_priority 100
--mca coll_sm_priority 100". We would be very interested in any results
you get (failures, improvements, non-improvements).
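A sketch of a full launch with both modules raised in priority (the
rank count, machinefile name, and ./app are placeholders):

    mpirun -np 64 -machinefile hosts \
        --mca coll_hierarch_priority 100 \
        --mca coll_sm_priority 100 \
        ./app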
I don't know why it is slow. OpenMPI is so flexible in how the
stack can be tuned, but I also have hundreds of users running dozens
of major codes, and what I need is a set of options that 'just works'
in most cases.
I will try the above options and get back to you.
Craig
thanks,
--td
Date: Thu, 06 Aug 2009 17:03:08 -0600
From: Craig Tierney <craig.tier...@noaa.gov>
Subject: Re: [OMPI users] Performance question about OpenMPI and MVAPICH2 on IB
To: Open MPI Users <us...@open-mpi.org>
A followup....
Part of the problem was affinity. I had written a script to do
processor and memory affinity (which works fine with MVAPICH2), an
idea I got from TACC. However, the script didn't seem to work
correctly with OpenMPI (or I still have bugs).
Setting --mca mpi_paffinity_alone 1 made things better (invocation
sketched below the table), but the performance is still not as good:
Cores   MVAPICH2   OpenMPI
--------------------------
    8       17.3      17.3
   16       31.7      31.5
   32       62.9      62.8
   64      110.8     108.0
  128      219.2     201.4
  256      384.5     342.7
  512      687.2     537.6
The performance number is GFlops (so larger is better).
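For reference, a sketch of a launch with that affinity setting enabled
(the rank count, machinefile name, and ./app are placeholders, not the
actual command used here):

    mpirun -np 512 -machinefile hosts \
        --mca mpi_paffinity_alone 1 \
        ./app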
The first few numbers show that the executable runs at the right
speed. I verified that IB is being used by running OMB (the OSU
Micro-Benchmarks) and checking latency and bandwidth; those numbers
are what I expect for QDR (roughly 3 GB/s and 1.5 µs).
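Such a check is typically done along these lines (a sketch; the
machinefile, which should list one slot on each of two nodes, is a
placeholder name):

    mpirun -np 2 -machinefile two_nodes ./osu_bw
    mpirun -np 2 -machinefile two_nodes ./osu_latency

osu_bw and osu_latency are the OMB bandwidth and latency tests.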
However, the OpenMPI version is not scaling as well. Any ideas
on why that might be the case?
Thanks,
Craig