On 2/27/14 16:47, Dave Love wrote:
> Bernd Dammann <b...@cc.dtu.dk> writes:
>> Hi,
>> I found this thread from before Christmas, and I wondered what the
>> status of this problem is. We have been seeing the same problems since
>> our upgrade to Scientific Linux 6.4, kernel 2.6.32-431.1.2.el6.x86_64,
>> and OpenMPI 1.6.5. Users have reported severe slowdowns in all kinds
>> of applications, such as VASP and OpenFOAM.
> I'm surprised a kernel change should be related to core binding, if
> that's the issue, or that it caused your slowdown. We were running that
> kernel OK until recently with that sort of application and that OMPI
> version.
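
To see quickly whether binding makes the difference here, one thing that
might help is to run a short job both ways and let mpirun print the
resulting placement (1.6-era options; ./app is just a stand-in for the
real application):

    # current behaviour, no explicit binding requested
    mpirun -np 8 ./app

    # same run, bound to cores, with the placement printed per rank
    mpirun -np 8 --bind-to-core --report-bindings ./app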
Maybe I should say that we moved from SL 6.1 and OMPI 1.4.x to SL 6.4
with the above kernel and OMPI 1.6.5 - which amounts to a major upgrade
of our cluster. After the upgrade, users reported those slowdowns, and a
search on this list showed that other sites had the same (or similar)
issues with this kernel and OMPI version combination.
> (The change to the default alltoallv collective algorithm in the OMPI
> 1.6 series, discussed in the archives, might affect you if you upgraded
> through it.)
OK, thanks - I'll take a look at it.
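
If it does turn out to be the alltoallv change, my understanding is that
the older algorithm can be selected again through the tuned collective
component for a test run - parameter names from memory, so please
double-check them with ompi_info first:

    # list the tuned-collective knobs related to alltoallv
    ompi_info --param coll tuned | grep alltoallv

    # try forcing the old (basic linear, if I remember the numbering
    # right) algorithm for one run; ./app is a placeholder
    mpirun -np 8 --mca coll_tuned_use_dynamic_rules 1 \
                 --mca coll_tuned_alltoallv_algorithm 1 ./app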
>> Using the workaround '--bind-to-core' only makes sense for those jobs
>> that allocate full nodes, but the majority of our jobs don't do that.
> I don't consider it a workaround. Just use a resource manager that
> sorts it out for you. For what it's worth, a recipe for SGE/OMPI is at
> <http://arc.liv.ac.uk/SGE/howto/sge-configs.html#_core_binding>. We're
> happy with that (and seem to be at least on a par with Intel using
> OMPI+GCC+OpenBLAS) now that users automatically get binding.
We use Moab/Torque, so we could use cpusets (but that had some other
side effects earlier, so we did not implement it in our setup).
Regardless of that, it looks strange to me that this combination of
kernel and OMPI has such a negative effect on application performance.
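
To pin down whether placement is actually behind the slowdowns on our
shared nodes, I'll probably start by checking where the ranks of a
running job sit ('vasp' below is just an example process name, not
necessarily what the binary is called here):

    # on a compute node, for one user's running ranks
    for pid in $(pgrep -u $USER vasp); do
        taskset -cp $pid                           # current CPU affinity
        grep Cpus_allowed_list /proc/$pid/status   # same info via /proc
    done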
Rgds,
Bernd