Ralph,

I added one of the newer LGA2011 nodes to my hostfile and re-ran the benchmark successfully, but saw some strange results with respect to the binding directives. Why are hyperthreading cores being used on the LGA2011 system but not on any of the other systems (which are mostly hyperthreaded Westmeres)? Isn't the --use-hwthread-cpus switch supposed to prevent OpenMPI from using hyperthreaded cores?
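In case it's relevant, here is how I plan to double-check what hwloc actually sees on each node type (assuming lscpu and hwloc's lstopo utility are installed on the compute nodes; I haven't captured the output here yet):

lscpu | egrep 'Socket|Core|Thread'
lstopo-no-graphics

On a node where hyperthreading is enabled and exposed, lscpu should report "Thread(s) per core: 2" and lstopo should show two PUs per core; on the non-hyperthreaded blades it should be 1.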
OpenMPI LAPACK invocation:

$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-single --mca btl_tcp_if_include eth0 --hetero-nodes --use-hwthread-cpus --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN

where NSLOTS=72.

hostfile:

csclprd3-6-1 slots=4 max-slots=4
csclprd3-6-5 slots=4 max-slots=4
csclprd3-0-0 slots=12 max-slots=24
csclprd3-0-1 slots=6 max-slots=12
csclprd3-0-2 slots=6 max-slots=12
csclprd3-0-3 slots=6 max-slots=12
csclprd3-0-4 slots=6 max-slots=12
csclprd3-0-5 slots=6 max-slots=12
csclprd3-0-6 slots=6 max-slots=12
#total number of successfully tested non-hyperthreaded compute slots at this point is 56
csclprd3-0-7 slots=16 max-slots=32

Report-bindings output on the LGA1366 Westmere node with two Intel Xeon X5675 6-core/12-hyperthread CPUs:

[csclprd3-0-0:11848] MCW rank 11 bound to socket 1[core 7[hwt 0]]: [./././././.][./B/./././.]
[csclprd3-0-0:11848] MCW rank 12 bound to socket 0[core 2[hwt 0]]: [././B/././.][./././././.]
[csclprd3-0-0:11848] MCW rank 13 bound to socket 1[core 8[hwt 0]]: [./././././.][././B/././.]
[csclprd3-0-0:11848] MCW rank 14 bound to socket 0[core 3[hwt 0]]: [./././B/./.][./././././.]
[csclprd3-0-0:11848] MCW rank 15 bound to socket 1[core 9[hwt 0]]: [./././././.][./././B/./.]
[csclprd3-0-0:11848] MCW rank 16 bound to socket 0[core 4[hwt 0]]: [././././B/.][./././././.]
[csclprd3-0-0:11848] MCW rank 17 bound to socket 1[core 10[hwt 0]]: [./././././.][././././B/.]
[csclprd3-0-0:11848] MCW rank 18 bound to socket 0[core 5[hwt 0]]: [./././././B][./././././.]
[csclprd3-0-0:11848] MCW rank 19 bound to socket 1[core 11[hwt 0]]: [./././././.][./././././B]
[csclprd3-0-0:11848] MCW rank 8 bound to socket 0[core 0[hwt 0]]: [B/././././.][./././././.]
[csclprd3-0-0:11848] MCW rank 9 bound to socket 1[core 6[hwt 0]]: [./././././.][B/././././.]
[csclprd3-0-0:11848] MCW rank 10 bound to socket 0[core 1[hwt 0]]: [./B/./././.][./././././.]

but for the LGA2011 system with two 8-core/16-hyperthread CPUs:

[csclprd3-0-7:30876] MCW rank 60 bound to socket 0[core 2[hwt 0-1]]: [../../BB/../../../../..][../../../../../../../..]
[csclprd3-0-7:30876] MCW rank 61 bound to socket 1[core 10[hwt 0-1]]: [../../../../../../../..][../../BB/../../../../..]
[csclprd3-0-7:30876] MCW rank 62 bound to socket 0[core 3[hwt 0-1]]: [../../../BB/../../../..][../../../../../../../..]
[csclprd3-0-7:30876] MCW rank 63 bound to socket 1[core 11[hwt 0-1]]: [../../../../../../../..][../../../BB/../../../..]
[csclprd3-0-7:30876] MCW rank 64 bound to socket 0[core 4[hwt 0-1]]: [../../../../BB/../../..][../../../../../../../..]
[csclprd3-0-7:30876] MCW rank 65 bound to socket 1[core 12[hwt 0-1]]: [../../../../../../../..][../../../../BB/../../..]
[csclprd3-0-7:30876] MCW rank 66 bound to socket 0[core 5[hwt 0-1]]: [../../../../../BB/../..][../../../../../../../..]
[csclprd3-0-7:30876] MCW rank 67 bound to socket 1[core 13[hwt 0-1]]: [../../../../../../../..][../../../../../BB/../..]
[csclprd3-0-7:30876] MCW rank 68 bound to socket 0[core 6[hwt 0-1]]: [../../../../../../BB/..][../../../../../../../..]
[csclprd3-0-7:30876] MCW rank 69 bound to socket 1[core 14[hwt 0-1]]: [../../../../../../../..][../../../../../../BB/..]
[csclprd3-0-7:30876] MCW rank 70 bound to socket 0[core 7[hwt 0-1]]: [../../../../../../../BB][../../../../../../../..]
[csclprd3-0-7:30876] MCW rank 71 bound to socket 1[core 15[hwt 0-1]]: [../../../../../../../..][../../../../../../../BB]
[csclprd3-0-7:30876] MCW rank 56 bound to socket 0[core 0[hwt 0-1]]: [BB/../../../../../../..][../../../../../../../..]
[csclprd3-0-7:30876] MCW rank 57 bound to socket 1[core 8[hwt 0-1]]: [../../../../../../../..][BB/../../../../../../..]
[csclprd3-0-7:30876] MCW rank 58 bound to socket 0[core 1[hwt 0-1]]: [../BB/../../../../../..][../../../../../../../..]
[csclprd3-0-7:30876] MCW rank 59 bound to socket 1[core 9[hwt 0-1]]: [../../../../../../../..][../BB/../../../../../..]

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Wednesday, April 08, 2015 10:26 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

On Apr 8, 2015, at 9:29 AM, Lane, William <william.l...@cshs.org> wrote:

Ralph,

Thanks for YOUR help, I never would've managed to get the LAPACK benchmark running on more than one node in our cluster without your help. Ralph, is hyperthreading more of a curse than an advantage for HPC applications?

Wow, you'll get a lot of argument over that issue! From what I can see, it is very application dependent. Some apps appear to benefit, while others can even suffer from it.

I think we should support a mix of nodes in this usage, so I'll try to come up with a way to do so.

I'm going to go through all the OpenMPI articles on hyperthreading and NUMA to see if that will shed any light on these issues.

-Bill L.

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Tuesday, April 07, 2015 7:32 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

I'm not sure our man pages are good enough to answer your question, but here is the URL: http://www.open-mpi.org/doc/v1.8/

I'm a tad tied up right now, but I'll try to address this prior to the 1.8.5 release. Thanks for all that debug effort! Helps a bunch.

On Apr 7, 2015, at 1:17 PM, Lane, William <william.l...@cshs.org> wrote:

Ralph,

I've finally had some luck using the following:

$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-single --mca btl_tcp_if_include eth0 --hetero-nodes --use-hwthread-cpus --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN

where $NSLOTS was 56 and my hostfile hostfile-single is:

csclprd3-0-0 slots=12 max-slots=24
csclprd3-0-1 slots=6 max-slots=12
csclprd3-0-2 slots=6 max-slots=12
csclprd3-0-3 slots=6 max-slots=12
csclprd3-0-4 slots=6 max-slots=12
csclprd3-0-5 slots=6 max-slots=12
csclprd3-0-6 slots=6 max-slots=12
csclprd3-6-1 slots=4 max-slots=4
csclprd3-6-5 slots=4 max-slots=4

The max-slots value differs from slots on some nodes because I include the hyperthreaded cores in max-slots; the last two nodes have CPUs that don't support hyperthreading at all. Does --use-hwthread-cpus prevent slots from being assigned to hyperthreading cores?

For some reason the manpage for OpenMPI 1.8.2 isn't installed on our CentOS 6.3 systems. Is there a URL where I can find a copy of the manpages for OpenMPI 1.8.2?

Thanks for your help,

-Bill Lane

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Monday, April 06, 2015 1:39 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

Hmmm…well, that shouldn't be the issue. To check, try running it with "bind-to none". If you can get a backtrace telling us where it is crashing, that would also help.
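If core dumps are enabled on the nodes, something along these lines should pull a backtrace out of the core file (a rough sketch; the core file name pattern and paths will vary by system, and <pid> is a placeholder):

ulimit -c unlimited        # in the shell that launches mpirun, before the run
gdb $BENCH_DIR/$APP_DIR/$APP_BIN core.<pid>
(gdb) bt

Even just the top few frames would tell us whether the crash is inside OMPI or in the benchmark itself.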
On Apr 6, 2015, at 12:24 PM, Lane, William <william.l...@cshs.org> wrote:

Ralph,

For the following two different commandline invocations of the LAPACK benchmark:

$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-no_slots --mca btl_tcp_if_include eth0 --hetero-nodes --use-hwthread-cpus --bind-to hwthread --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN

$MPI_DIR/bin/mpirun -np $NSLOTS --report-bindings --hostfile hostfile-no_slots --mca btl_tcp_if_include eth0 --hetero-nodes --bind-to-core --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN

I'm receiving the same kinds of OpenMPI error messages (but for different nodes in the ring):

[csclprd3-0-16:25940] *** Process received signal ***
[csclprd3-0-16:25940] Signal: Bus error (7)
[csclprd3-0-16:25940] Signal code: Non-existant physical address (2)
[csclprd3-0-16:25940] Failing at address: 0x7f8b1b5a2600
--------------------------------------------------------------------------
mpirun noticed that process rank 82 with PID 25936 on node csclprd3-0-16 exited on signal 7 (Bus error).
--------------------------------------------------------------------------
16 total processes killed (some possibly by mpirun during cleanup)

It seems to occur on systems that have more than one physical CPU installed. Could this be due to a lack of the correct NUMA libraries being installed?

-Bill L.

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Sunday, April 05, 2015 6:09 PM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

On Apr 5, 2015, at 5:58 PM, Lane, William <william.l...@cshs.org> wrote:

I think some of the Intel Blade systems in the cluster are dual core, but don't support hyperthreading. Maybe it would be better to exclude hyperthreading altogether from submitted OpenMPI jobs?

Yes - or you can add "--hetero-nodes --use-hwthread-cpus --bind-to hwthread" to the cmd line. The first tells mpirun that the nodes aren't all the same, and so it has to look at each node's topology instead of taking the first node as the template for everything. The second tells it to use the HTs as independent cpus where they are supported.

I'm not entirely sure the suggestion will work - if we hit a place where HT isn't supported, we may balk at being asked to bind to HTs. I can probably make a change that supports this kind of hetero arrangement (perhaps something like bind-to pu) - might make it into 1.8.5 (we are just starting the release process on it now).

OpenMPI doesn't crash, but it doesn't run the LAPACK benchmark either.

Thanks again Ralph.

Bill L.

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Wednesday, April 01, 2015 8:40 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

Bingo - you said the magic word. This is a terminology issue. When we say "core", we mean the old definition of "core", not "hyperthreads". If you want to use HTs as your base processing unit and bind to them, then you need to specify --bind-to hwthread. That warning should then go away.

We don't require a swap region be mounted - I didn't see anything in your original message indicating that OMPI had actually crashed, but just wasn't launching due to the above issue. Were you actually seeing crashes as well?
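If you want to sanity-check the binding behavior without running the full benchmark, a quick smoke test along these lines (just a sketch using your existing hostfile; adjust -np as you like, and hostname is only used as a cheap stand-in for the application) should show whether each rank lands on an individual hyperthread:

mpirun -np 4 --hostfile hostfile --report-bindings --use-hwthread-cpus --bind-to hwthread hostname

With --bind-to hwthread you should see a single "B" per rank in the --report-bindings output (one hardware thread), whereas --bind-to core on a hyperthreaded node shows "BB" (both threads of the core).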
On Wed, Apr 1, 2015 at 8:31 AM, Lane, William <william.l...@cshs.org> wrote:

Ralph,

Here's the associated hostfile:

#openMPI hostfile for csclprd3
#max slots prevents oversubscribing csclprd3-0-9
csclprd3-0-0 slots=12 max-slots=12
csclprd3-0-1 slots=6 max-slots=6
csclprd3-0-2 slots=6 max-slots=6
csclprd3-0-3 slots=6 max-slots=6
csclprd3-0-4 slots=6 max-slots=6
csclprd3-0-5 slots=6 max-slots=6
csclprd3-0-6 slots=6 max-slots=6
csclprd3-0-7 slots=32 max-slots=32
csclprd3-0-8 slots=32 max-slots=32
csclprd3-0-9 slots=32 max-slots=32
csclprd3-0-10 slots=32 max-slots=32
csclprd3-0-11 slots=32 max-slots=32
csclprd3-0-12 slots=12 max-slots=12
csclprd3-0-13 slots=24 max-slots=24
csclprd3-0-14 slots=16 max-slots=16
csclprd3-0-15 slots=16 max-slots=16
csclprd3-0-16 slots=24 max-slots=24
csclprd3-0-17 slots=24 max-slots=24
csclprd3-6-1 slots=4 max-slots=4
csclprd3-6-5 slots=4 max-slots=4

The number of slots also includes hyperthreading cores.

One more question: would not having defined swap partitions on all the nodes in the ring cause OpenMPI to crash? Because no swap partitions are defined for any of the above systems.

-Bill L.

________________________________
From: users [users-boun...@open-mpi.org] on behalf of Ralph Castain [r...@open-mpi.org]
Sent: Wednesday, April 01, 2015 5:04 AM
To: Open MPI Users
Subject: Re: [OMPI users] OpenMPI 1.8.2 problems on CentOS 6.3

The warning about binding to memory is due to not having numactl-devel installed on the system. The job would still run, but we are warning you that we cannot bind memory to the same domain as the core where we bind the process. That can cause poor performance, but it is not fatal. I forget the name of the param, but you can tell us to "shut up" :-)

The other warning/error indicates that we aren't seeing enough cores on the allocation you gave us via the hostfile to support one proc/core - i.e., we didn't see at least 128 cores in the sum of the nodes you told us about. I take it you were expecting that there were that many or more?

Ralph

On Wed, Apr 1, 2015 at 12:54 AM, Lane, William <william.l...@cshs.org> wrote:

I'm having problems running OpenMPI jobs (using a hostfile) on an HPC cluster running ROCKS on CentOS 6.3. I'm running OpenMPI outside of Sun Grid Engine (i.e. it is not submitted as a job to SGE). The program being run is a LAPACK benchmark. The command line I'm using to run the jobs is:

$MPI_DIR/bin/mpirun -np $NSLOTS -bind-to-core -report-bindings --hostfile hostfile --mca btl_tcp_if_include eth0 --prefix $MPI_DIR $BENCH_DIR/$APP_DIR/$APP_BIN

where MPI_DIR=/hpc/apps/mpi/openmpi/1.8.2/ and NSLOTS=128.

I'm getting errors of the form below, and OpenMPI never runs the LAPACK benchmark:

--------------------------------------------------------------------------
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node:  csclprd3-0-11

This usually is due to not having the required NUMA support installed
on the node. In some Linux distributions, the required support is
contained in the libnumactl and libnumactl-devel packages.
This is a warning only; your job will continue, though performance may be degraded.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:     CORE
   Node:        csclprd3-0-11
   #processes:  2
   #cpus:       1

You can override this protection by adding the "overload-allowed"
option to your binding directive.
--------------------------------------------------------------------------

The only installed numa packages are:

numactl.x86_64   2.0.7-3.el6   @centos6.3-x86_64-0/$

When I search for the available NUMA packages I find:

yum search numa | less
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
============================== N/S Matched: numa ===============================
numactl-devel.i686 : Development package for building Applications that use numa
numactl-devel.x86_64 : Development package for building Applications that use numa
numad.x86_64 : NUMA user daemon
numactl.i686 : Library for tuning for Non Uniform Memory Access machines
numactl.x86_64 : Library for tuning for Non Uniform Memory Access machines

Do I need to install additional and/or different NUMA packages in order to get OpenMPI to work on this cluster?

-Bill Lane
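P.S. My guess (just a guess, based on the yum search output above) is that installing the devel package on every compute node, e.g.

yum install -y numactl-devel

run on each node (or pushed out through ROCKS), would give OpenMPI the memory-binding support it is warning about, but I'd like to confirm that before touching all the nodes.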