Using the "--mca btl tcp,self" switch to mpirun solved all the issues (in 
addition to
the requirement to include the --mca btl_tcp_if_include eth0 switch). I believe
the "--mca btl tcp,self" switch limits inter-process communication within a 
node to using the TCP
loopback rather than shared memory. I should also point out that all of the 
nodes
on this cluster feature NUMA architecture.
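
For reference, this is roughly the full command line that now works for me (the
hostfile and test binary are the same ones from my earlier messages below):

    mpirun -np 32 --prefix /usr/lib64/openmpi --hostfile hostfile \
        --mca btl tcp,self --mca btl_tcp_if_include eth0 \
        /hpc/home/lanew/mpi/openmpi/ProcessColors2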

Will using the "--mca btl tcp,self" switch to mpirun result in degraded
performance compared to using shared memory?

-Bill Lane

________________________________________
From: users [users-boun...@open-mpi.org] on behalf of Jeff Squyres (jsquyres) 
[jsquy...@cisco.com]
Sent: Tuesday, July 22, 2014 2:29 PM
To: Open MPI User's List
Subject: Re: [OMPI users] Mpirun 1.5.4  problems when request > 28 slots

Hyperthreading is pretty great for non-HPC applications, which is why Intel 
makes it.  But hyperthreading *generally* does not help HPC application 
performance.  You're basically splitting several on-chip resources / queues / 
pipelines between two hardware threads, and that can hurt performance-hungry 
HPC applications.

This is a per-application issue, of course, so YMMV.  But the general wisdom -- 
even with Intel Ivy Bridge-class chips -- is to disable hyperthreading for HPC 
apps.

That being said, Open MPI started supporting hyperthreading properly somewhere 
in the 1.5/1.6 series (I don't remember the exact version).  These are among 
the reasons that we're urging you to upgrade to at least 1.6.5.  "Supporting 
hyperthreading properly" means: when you say "bind to core", OMPI will 
recognize that each core is composed of N hyperthreads, and will bind to all of 
them (vs. binding each MPI process to a single Linux virtual processor, which 
may be a core or a hyperthread).

So if you're running in a bind-to-core situation with a "before OMPI supported 
HT properly" version, you'll bind 2 MPI processes to a single core, and that 
will likely be pretty terrible for overall performance.
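
If you want to double-check what you're actually getting, mpirun can print the 
bindings it computes.  If I remember right, in the 1.5/1.6 series the options 
are spelled roughly like this (your_app is just a placeholder, of course):

    mpirun -np 4 --bind-to-core --report-bindings ./your_app

Each node's daemon should emit one line per MPI process showing which 
processor(s) it was bound to.  (In later releases the option became 
"--bind-to core".)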

Does that help?


On Jul 22, 2014, at 5:18 PM, Lane, William <william.l...@cshs.org> wrote:

> Ralph,
>
> The 32-slot systems/nodes I'm running my openMPI test code on only have
> 16 physical cores; the rest of the slots are hyperthreads. I've done some more
> testing and noticed that if I limit the number of slots per node to 8
> (via -npernode 8), everything works and 8 slots are used from each system/node:
>
> mpirun -np 32 -npernode 8 --prefix /usr/lib64/openmpi --hostfile hostfile \
>     --mca btl_tcp_if_include eth0 --mca pls_gridengine_verbose 1 \
>     /hpc/home/lanew/mpi/openmpi/ProcessColors2
>
> However, when I run this same test code on an older cluster (with a much older
> version of openMPI [1.3.3]), I have no problems using all the cores (including
> the hyperthreading cores). The Intel CPUs are different in each case: the older
> cluster uses two 6-core Xeons with 12 hyperthreads each, while the new cluster
> uses two 8-core Sandy Bridge chips with 16 hyperthreads apiece.
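>
> (A quick way to compare the two, in case it's useful: something along the
> lines of
>
>     lscpu | egrep 'Socket|Core|Thread'
>
> on each node reports sockets, cores per socket, and threads per core, which
> shows exactly how many of the "slots" are physical cores vs. hyperthreads.)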
>
> Is hyperthreading an issue with openMPI? Should hyperthreading always be 
> turned
> off for openMPI apps?
>
> Thanks for your time,
>
> -Bill Lane
>
>
> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
> [r...@open-mpi.org]
> Sent: Tuesday, July 22, 2014 7:57 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
>
> Hmmm...that's not a "bug", but just a packaging issue with the way CentOS 
> distributed some variants of OMPI that requires you to install/update things 
> in a specific order.
>
> On Jul 20, 2014, at 11:34 PM, Lane, William <william.l...@cshs.org> wrote:
>
>> Please see:
>>
>> http://bugs.centos.org/view.php?id=5812
>>
>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
>> [r...@open-mpi.org]
>> Sent: Sunday, July 20, 2014 9:30 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
>>
>> I'm unaware of any CentOS-OMPI bug, and I've been using CentOS throughout 
>> the 6.x series running OMPI 1.6.x and above.
>>
>> I can't speak to the older versions of CentOS and/or the older versions of 
>> OMPI.
>>
>> On Jul 19, 2014, at 8:14 PM, Lane, William <william.l...@cshs.org> wrote:
>>
>>> Yes, there is a second HPC Sun Grid Engine cluster on which I've run
>>> this openMPI test code dozens of times on upwards of 400 slots
>>> through SGE using qsub and qrsh, but this was using a much
>>> older version of openMPI (1.3.3 I believe). On that particular cluster the
>>> open files hard and soft limits were an issue.
>>>
>>> I have noticed a recently reported (as of July 2014) CentOS openMPI bug that
>>> occurs when CentOS is upgraded from 6.2 to 6.3. I'm not sure whether that
>>> bug applies to this situation, though.
>>>
>>> This particular problem occurs whether I submit jobs through SGE (via qrsh
>>> or qsub) or run outside of SGE, which leads me to believe it is an openMPI
>>> and/or CentOS issue.
>>>
>>> -Bill Lane
>>>
>>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
>>> [r...@open-mpi.org]
>>> Sent: Saturday, July 19, 2014 3:21 PM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
>>>
>>> Not for this test case size. You should be just fine with the default 
>>> values.
>>>
>>> If I understand you correctly, you've run this app at scale before on 
>>> another cluster without problem?
>>>
>>> On Jul 19, 2014, at 1:34 PM, Lane, William <william.l...@cshs.org> wrote:
>>>
>>>> Ralph,
>>>>
>>>> It's hard to imagine it's the openMPI code, because I've tested this code
>>>> extensively on another cluster with 400 nodes and never had any problems.
>>>> But I'll try using the hello_c example in any case. Is it still recommended
>>>> to raise the open files soft and hard limits to 4096? Or are even larger
>>>> values necessary?
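>>>>
>>>> (For reference, one way to check and raise those limits on CentOS -- just a
>>>> sketch, not necessarily the recommended approach -- is roughly:
>>>>
>>>>     ulimit -Sn    # current soft limit on open files
>>>>     ulimit -Hn    # current hard limit on open files
>>>>
>>>> and then add lines such as "* soft nofile 4096" and "* hard nofile 4096" to
>>>> /etc/security/limits.conf so the new limits persist across logins.)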
>>>>
>>>> Thank you for your help.
>>>>
>>>> -Bill Lane
>>>>
>>>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
>>>> [r...@open-mpi.org]
>>>> Sent: Saturday, July 19, 2014 8:07 AM
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
>>>>
>>>> That's a pretty old OMPI version, and we don't really support it any 
>>>> longer. However, I can provide some advice:
>>>>
>>>> * have you tried running the simple "hello_c" example we provide? This 
>>>> would at least tell you if the problem is in your app, which is what I'd 
>>>> expect given your description
>>>>
>>>> * try using gdb (or pick your debugger) to look at the corefile and see 
>>>> where it is failing
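>>>>
>>>> A minimal sequence would be something like this (assuming core dumps are
>>>> enabled with "ulimit -c unlimited" and the binary isn't stripped; the core
>>>> file name below is just an example):
>>>>
>>>>     gdb /hpc/home/lanew/mpi/openmpi/ProcessColors2 core.802
>>>>     (gdb) bt
>>>>
>>>> The backtrace ("bt") should show roughly where the rank was when it hit the
>>>> segfault.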
>>>>
>>>> I'd also suggest updating OMPI to the 1.6.5 or 1.8.1 versions, but I doubt 
>>>> that's the issue behind this problem.
>>>>
>>>>
>>>> On Jul 19, 2014, at 1:05 AM, Lane, William <william.l...@cshs.org> wrote:
>>>>
>>>>> I'm getting consistent errors of the form:
>>>>>
>>>>> "mpirun noticed that process rank 3 with PID 802 on node csclprd3-0-8 
>>>>> exited on signal 11 (Segmentation fault)."
>>>>>
>>>>> whenever I request more than 28 slots. These
>>>>> errors even occur when I run mpirun locally
>>>>> on a compute node that has 32 slots (8 cores, 16 with hyperthreading).
>>>>>
>>>>> When I request fewer than 28 slots, I have no problems whatsoever.
>>>>>
>>>>> OS:
>>>>> CentOS release 6.3 (Final)
>>>>>
>>>>> openMPI information:
>>>>>                  Package: Open MPI mockbu...@c6b8.bsys.dev.centos.org Distribution
>>>>>                 Open MPI: 1.5.4
>>>>>    Open MPI SVN revision: r25060
>>>>>    Open MPI release date: Aug 18, 2011
>>>>>                 Open RTE: 1.5.4
>>>>>    Open RTE SVN revision: r25060
>>>>>    Open RTE release date: Aug 18, 2011
>>>>>                     OPAL: 1.5.4
>>>>>        OPAL SVN revision: r25060
>>>>>        OPAL release date: Aug 18, 2011
>>>>>             Ident string: 1.5.4
>>>>>                   Prefix: /usr/lib64/openmpi
>>>>>  Configured architecture: x86_64-unknown-linux-gnu
>>>>>           Configure host: c6b8.bsys.dev.centos.org
>>>>>            Configured by: mockbuild
>>>>>            Configured on: Fri Jun 22 06:42:03 UTC 2012
>>>>>           Configure host: c6b8.bsys.dev.centos.org
>>>>>                 Built by: mockbuild
>>>>>                 Built on: Fri Jun 22 06:46:48 UTC 2012
>>>>>               Built host: c6b8.bsys.dev.centos.org
>>>>>               C bindings: yes
>>>>>             C++ bindings: yes
>>>>>       Fortran77 bindings: yes (all)
>>>>>       Fortran90 bindings: yes
>>>>>  Fortran90 bindings size: small
>>>>>               C compiler: gcc
>>>>>      C compiler absolute: /usr/bin/gcc
>>>>>   C compiler family name: GNU
>>>>>       C compiler version: 4.4.6
>>>>>             C++ compiler: g++
>>>>>    C++ compiler absolute: /usr/bin/g++
>>>>>       Fortran77 compiler: gfortran
>>>>>   Fortran77 compiler abs: /usr/bin/gfortran
>>>>>       Fortran90 compiler: gfortran
>>>>>   Fortran90 compiler abs: /usr/bin/gfortran
>>>>>              C profiling: yes
>>>>>            C++ profiling: yes
>>>>>      Fortran77 profiling: yes
>>>>>      Fortran90 profiling: yes
>>>>>           C++ exceptions: no
>>>>>           Thread support: posix (MPI_THREAD_MULTIPLE: no, progress: no)
>>>>>            Sparse Groups: no
>>>>>   Internal debug support: no
>>>>>   MPI interface warnings: no
>>>>>      MPI parameter check: runtime
>>>>> Memory profiling support: no
>>>>> Memory debugging support: no
>>>>>          libltdl support: yes
>>>>>    Heterogeneous support: no
>>>>>  mpirun default --prefix: no
>>>>>          MPI I/O support: yes
>>>>>        MPI_WTIME support: gettimeofday
>>>>>      Symbol vis. support: yes
>>>>>           MPI extensions: affinity example
>>>>>    FT Checkpoint support: no (checkpoint thread: no)
>>>>>   MPI_MAX_PROCESSOR_NAME: 256
>>>>>     MPI_MAX_ERROR_STRING: 256
>>>>>      MPI_MAX_OBJECT_NAME: 64
>>>>>         MPI_MAX_INFO_KEY: 36
>>>>>         MPI_MAX_INFO_VAL: 256
>>>>>        MPI_MAX_PORT_NAME: 1024
>>>>>   MPI_MAX_DATAREP_STRING: 128
>>>>>            MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>           MCA memchecker: valgrind (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA memory: linux (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA paffinity: hwloc (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA carto: file (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA maffinity: libnuma (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA timer: linux (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>          MCA installdirs: env (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>          MCA installdirs: config (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA dpm: orte (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA allocator: basic (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: basic (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: inter (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: self (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: sm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: sync (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: tuned (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA mpool: fake (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA mpool: sm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA pml: bfo (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA pml: csum (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA pml: v (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA bml: r2 (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA rcache: vma (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: ofud (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: openib (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: self (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: sm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: tcp (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA topo: unity (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA osc: rdma (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA iof: hnp (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA iof: orted (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA iof: tool (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA oob: tcp (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA odls: default (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ras: cm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ras: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: topo (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA rml: oob (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: binomial (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: cm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: direct (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: linear (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: radix (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: slave (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA plm: rsh (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA plm: rshd (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA plm: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA filem: rsh (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA errmgr: default (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: env (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: hnp (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: singleton (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: slave (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: slurmd (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: tool (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>              MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>              MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>              MCA grpcomm: hier (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>             MCA notifier: command (MCA v2.0, API v1.0, Component v1.5.4)
>>>>>             MCA notifier: smtp (MCA v2.0, API v1.0, Component v1.5.4)
>>>>>             MCA notifier: syslog (MCA v2.0, API v1.0, Component v1.5.4)
>>>>>
>>>>
>>>
>>
>


--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

