Hyperthreading is pretty great for non-HPC applications, which is why Intel 
makes it.  But hyperthreading *generally* does not help HPC application 
performance.  You're basically halving several on-chip resources / queues / 
pipelines, and that can hurt performance-hungry HPC applications.

This is a per-application issue, of course, so YMMV.  But the general wisdom -- 
even with Intel Ivy Bridge-class chips -- is to disable hyperthreading for HPC 
apps.  

That being said, Open MPI started supporting hyperthreading properly somewhere 
in the 1.5/1.6 series (I don't remember the exact version).  This is among the 
reasons we're urging you to upgrade to at least 1.6.5.  "Supporting 
hyperthreading properly" means: when you say "bind to core", OMPI will 
recognize that each core is composed of N hyperthreads, and will bind to all of 
them (vs. binding each MPI process to a single Linux virtual processor, which 
may be a core or a hyperthread).

So if you're running in a bind-to-core situation with a version from before 
OMPI supported HT properly, you'll bind 2 MPI processes to a single core, and 
that will likely be pretty terrible for overall performance.
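
For example, with 1.6.5 you can ask mpirun to show you exactly what each rank 
got bound to (the hostfile and app names below are just placeholders):

    mpirun -np 32 --bind-to-core --report-bindings --hostfile hostfile ./your_app

On an HT-enabled node, a "supports HT properly" version will report each rank's 
binding as covering both hardware threads of its core.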

Does that help?


On Jul 22, 2014, at 5:18 PM, Lane, William <william.l...@cshs.org> wrote:

> Ralph,
> 
> The 32-slot systems/nodes I'm running my openMPI test code on only have
> 16 physical cores; the rest of the slots are hyperthreads. I've done some more
> testing and noticed that if I limit the number of slots per node to 8
> (via -npernode 8), everything works and 8 slots are used from each system/node:
> 
> mpirun -np 32 -npernode 8 --prefix /usr/lib64/openmpi --hostfile hostfile \
>     --mca btl_tcp_if_include eth0 --mca pls_gridengine_verbose 1 \
>     /hpc/home/lanew/mpi/openmpi/ProcessColors2
> 
> However, when I run this same test code on an older cluster (with a much
> older version of openMPI [1.3.3]), I have no problems using all the cores
> (including the hyperthreading cores). The Intel CPUs are different in each
> case: the older cluster uses two 6-core Xeons with 12 hyperthreads apiece,
> while the new cluster uses two 8-core Sandy Bridges with 16 hyperthreads
> apiece.
> 
> Is hyperthreading an issue with openMPI? Should hyperthreading always be
> turned off for openMPI apps?
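
FWIW, a quick way to sanity-check that topology is hwloc's lstopo; with HT 
enabled, each of the 16 cores should show up as two "PU" objects:

    lstopo --no-io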
> 
> Thanks for your time,
> 
> -Bill Lane
> 
> 
> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
> [r...@open-mpi.org]
> Sent: Tuesday, July 22, 2014 7:57 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
> 
> Hmmm...that's not a "bug", but just a packaging issue with the way CentOS
> distributed some variants of OMPI that requires you to install/update things
> in a specific order.
> 
> On Jul 20, 2014, at 11:34 PM, Lane, William <william.l...@cshs.org> wrote:
> 
>> Please see:
>> 
>> http://bugs.centos.org/view.php?id=5812
>> 
>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
>> [r...@open-mpi.org]
>> Sent: Sunday, July 20, 2014 9:30 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
>> 
>> I'm unaware of any CentOS-OMPI bug, and I've been using CentOS throughout 
>> the 6.x series running OMPI 1.6.x and above.
>> 
>> I can't speak to the older versions of CentOS and/or the older versions of 
>> OMPI.
>> 
>> On Jul 19, 2014, at 8:14 PM, Lane, William <william.l...@cshs.org> wrote:
>> 
>>> Yes there is a second HPC Sun Grid Engine cluster on which I've run
>>> this openMPI test code dozens of times on upwards of 400 slots
>>> through SGE using qsub and qrsh, but this was using a much
>>> older version of openMPI (1.3.3 I believe). On that particular cluster the
>>> open files hard and soft limits were an issue.
>>> 
>>> I have noticed a new (as of July 2014) CentOS openMPI bug that occurs
>>> when CentOS is upgraded from 6.2 to 6.3. I'm not sure if that bug applies
>>> to this situation, though.
>>> 
>>> This particular problem occurs whether I submit jobs through SGE (via qrsh
>>> or qsub) or outside of SGE, which leads me to believe it is an openMPI
>>> and/or CentOS issue.
>>> 
>>> -Bill Lane
>>> 
>>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
>>> [r...@open-mpi.org]
>>> Sent: Saturday, July 19, 2014 3:21 PM
>>> To: Open MPI Users
>>> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
>>> 
>>> Not for this test case size. You should be just fine with the default 
>>> values.
>>> 
>>> If I understand you correctly, you've run this app at scale before on 
>>> another cluster without problem?
>>> 
>>> On Jul 19, 2014, at 1:34 PM, Lane, William <william.l...@cshs.org> wrote:
>>> 
>>>> Ralph,
>>>> 
>>>> It's hard to imagine it's the openMPI code because I've tested this code
>>>> extensively on another cluster with 400 nodes and never had any problems.
>>>> But I'll try using the hello_c example in any case. Is it still
>>>> recommended to raise the open files soft and hard limits to 4096? Or are
>>>> even larger values necessary?
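
FWIW, a quick way to check what you currently have:

    ulimit -Sn    # current soft limit on open files
    ulimit -Hn    # current hard limit on open files

If you do need to raise them persistently, /etc/security/limits.conf is the 
usual place on CentOS.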
>>>> 
>>>> Thank you for your help.
>>>> 
>>>> -Bill Lane
>>>> 
>>>> From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain 
>>>> [r...@open-mpi.org]
>>>> Sent: Saturday, July 19, 2014 8:07 AM
>>>> To: Open MPI Users
>>>> Subject: Re: [OMPI users] Mpirun 1.5.4 problems when request > 28 slots
>>>> 
>>>> That's a pretty old OMPI version, and we don't really support it any 
>>>> longer. However, I can provide some advice:
>>>> 
>>>> * have you tried running the simple "hello_c" example we provide? This 
>>>> would at least tell you if the problem is in your app, which is what I'd 
>>>> expect given your description
>>>> 
>>>> * try using gdb (or pick your debugger) to look at the corefile and see 
>>>> where it is failing
>>>> 
>>>> I'd also suggest updating OMPI to the 1.6.5 or 1.8.1 versions, but I doubt 
>>>> that's the issue behind this problem.
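
FWIW, concretely that boils down to something like this (hello_c.c ships in the 
examples/ directory of the OMPI tarball; the core file name depends on your 
kernel's core_pattern setting, so "core.802" below is just a guess based on the 
PID in your error message):

    mpicc examples/hello_c.c -o hello_c
    mpirun -np 32 --hostfile hostfile ./hello_c

    gdb /hpc/home/lanew/mpi/openmpi/ProcessColors2 core.802
    (gdb) bt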
>>>> 
>>>> 
>>>> On Jul 19, 2014, at 1:05 AM, Lane, William <william.l...@cshs.org> wrote:
>>>> 
>>>>> I'm getting consistent errors of the form:
>>>>> 
>>>>> "mpirun noticed that process rank 3 with PID 802 on node csclprd3-0-8 
>>>>> exited on signal 11 (Segmentation fault)."
>>>>> 
>>>>> whenever I request more than 28 slots. These
>>>>> errors even occur when I run mpirun locally
>>>>> on a compute node that has 32 slots (8 cores, 16 with hyperthreading).
>>>>> 
>>>>> When I request fewer than 28 slots, I have no problems whatsoever.
>>>>> 
>>>>> OS: 
>>>>> CentOS release 6.3 (Final)
>>>>> 
>>>>> openMPI information:
>>>>>                  Package: Open MPI mockbu...@c6b8.bsys.dev.centos.org Distribution
>>>>>                 Open MPI: 1.5.4
>>>>>    Open MPI SVN revision: r25060
>>>>>    Open MPI release date: Aug 18, 2011
>>>>>                 Open RTE: 1.5.4
>>>>>    Open RTE SVN revision: r25060
>>>>>    Open RTE release date: Aug 18, 2011
>>>>>                     OPAL: 1.5.4
>>>>>        OPAL SVN revision: r25060
>>>>>        OPAL release date: Aug 18, 2011
>>>>>             Ident string: 1.5.4
>>>>>                   Prefix: /usr/lib64/openmpi
>>>>>  Configured architecture: x86_64-unknown-linux-gnu
>>>>>           Configure host: c6b8.bsys.dev.centos.org
>>>>>            Configured by: mockbuild
>>>>>            Configured on: Fri Jun 22 06:42:03 UTC 2012
>>>>>           Configure host: c6b8.bsys.dev.centos.org
>>>>>                 Built by: mockbuild
>>>>>                 Built on: Fri Jun 22 06:46:48 UTC 2012
>>>>>               Built host: c6b8.bsys.dev.centos.org
>>>>>               C bindings: yes
>>>>>             C++ bindings: yes
>>>>>       Fortran77 bindings: yes (all)
>>>>>       Fortran90 bindings: yes
>>>>>  Fortran90 bindings size: small
>>>>>               C compiler: gcc
>>>>>      C compiler absolute: /usr/bin/gcc
>>>>>   C compiler family name: GNU
>>>>>       C compiler version: 4.4.6
>>>>>             C++ compiler: g++
>>>>>    C++ compiler absolute: /usr/bin/g++
>>>>>       Fortran77 compiler: gfortran
>>>>>   Fortran77 compiler abs: /usr/bin/gfortran
>>>>>       Fortran90 compiler: gfortran
>>>>>   Fortran90 compiler abs: /usr/bin/gfortran
>>>>>              C profiling: yes
>>>>>            C++ profiling: yes
>>>>>      Fortran77 profiling: yes
>>>>>      Fortran90 profiling: yes
>>>>>           C++ exceptions: no
>>>>>           Thread support: posix (MPI_THREAD_MULTIPLE: no, progress: no)
>>>>>            Sparse Groups: no
>>>>>   Internal debug support: no
>>>>>   MPI interface warnings: no
>>>>>      MPI parameter check: runtime
>>>>> Memory profiling support: no
>>>>> Memory debugging support: no
>>>>>          libltdl support: yes
>>>>>    Heterogeneous support: no
>>>>>  mpirun default --prefix: no
>>>>>          MPI I/O support: yes
>>>>>        MPI_WTIME support: gettimeofday
>>>>>      Symbol vis. support: yes
>>>>>           MPI extensions: affinity example
>>>>>    FT Checkpoint support: no (checkpoint thread: no)
>>>>>   MPI_MAX_PROCESSOR_NAME: 256
>>>>>     MPI_MAX_ERROR_STRING: 256
>>>>>      MPI_MAX_OBJECT_NAME: 64
>>>>>         MPI_MAX_INFO_KEY: 36
>>>>>         MPI_MAX_INFO_VAL: 256
>>>>>        MPI_MAX_PORT_NAME: 1024
>>>>>   MPI_MAX_DATAREP_STRING: 128
>>>>>            MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>           MCA memchecker: valgrind (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA memory: linux (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA paffinity: hwloc (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA carto: file (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA maffinity: libnuma (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA timer: linux (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>          MCA installdirs: env (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>          MCA installdirs: config (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA dpm: orte (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA allocator: basic (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>            MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: basic (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: inter (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: self (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: sm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: sync (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA coll: tuned (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA mpool: fake (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA mpool: rdma (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA mpool: sm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA pml: bfo (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA pml: csum (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA pml: ob1 (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA pml: v (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA bml: r2 (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA rcache: vma (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: ofud (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: openib (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: self (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: sm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA btl: tcp (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA topo: unity (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA osc: pt2pt (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA osc: rdma (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA iof: hnp (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA iof: orted (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA iof: tool (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA oob: tcp (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                 MCA odls: default (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ras: cm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ras: gridengine (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ras: loadleveler (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ras: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: load_balance (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: rank_file (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: resilient (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: round_robin (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: seq (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA rmaps: topo (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA rml: oob (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: binomial (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: cm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: direct (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: linear (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: radix (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA routed: slave (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA plm: rsh (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA plm: rshd (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA plm: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                MCA filem: rsh (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>               MCA errmgr: default (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: env (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: hnp (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: singleton (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: slave (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: slurm (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: slurmd (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>                  MCA ess: tool (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>              MCA grpcomm: bad (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>              MCA grpcomm: basic (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>              MCA grpcomm: hier (MCA v2.0, API v2.0, Component v1.5.4)
>>>>>             MCA notifier: command (MCA v2.0, API v1.0, Component v1.5.4)
>>>>>             MCA notifier: smtp (MCA v2.0, API v1.0, Component v1.5.4)
>>>>>             MCA notifier: syslog (MCA v2.0, API v1.0, Component v1.5.4)
>>>>> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
