Dear Gilles, thanks for your answer. Here are the details you asked about:

- compiler: gcc-6.3.0
- OpenMP environment variables: OMP_PROC_BIND=true; GOMP_CPU_AFFINITY is not set
- the hyperthread a given OpenMP thread is on: it is printed in the output
  below as the 3-digit number after the first ",", read by sched_getcpu() in
  the OpenMP test code (a condensed sketch of that code follows below)
- migration between cores/hyperthreads should be prevented by OMP_PROC_BIND=true
- I did not find any migration, but rather two OpenMP threads sharing one
  physical core: in example "4", "MPI Instance 0002", CPUs 011/031 are both
  on core #11

Are there any hints on how to cleanly transfer the Open MPI binding to the
OpenMP threads? (One idea I plan to try is sketched below.)
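What I plan to try next (an assumption on my side, not yet tested on our
cluster): letting libgomp place the threads explicitly inside the socket that
Open MPI assigns, via OMP_PLACES. gcc-6.3.0 supports the OpenMP 4.0 affinity
variables, and mpirun's -x flag can export them to the remote nodes:

   export OMP_PLACES=cores
   export OMP_PROC_BIND=close
   mpirun -np 4 --map-by ppr:2:node -x OMP_PLACES -x OMP_PROC_BIND \
          --mca plm_rsh_agent "qrsh" -report-bindings ./myid

With OMP_PLACES=cores each OpenMP thread should be pinned to one physical core
(both of its hyperthreads), so no two threads should end up on the same core.

And since you asked for the source: the core of my test program is essentially
the following condensed sketch (MPI error checking trimmed, and the real code
prints some more decoration):

   /* myid.c (condensed sketch) -- compile e.g. with:
    *    mpicc -fopenmp -o myid myid.c
    */
   #define _GNU_SOURCE
   #include <sched.h>     /* sched_getcpu() */
   #include <stdio.h>
   #include <string.h>
   #include <unistd.h>    /* getpid() */
   #include <mpi.h>
   #include <omp.h>

   /* print the Cpus_allowed_list line of the current process */
   static void print_allowed(void)
   {
       char line[512];
       FILE *f = fopen("/proc/self/status", "r");
       if (!f) return;
       while (fgets(line, sizeof(line), f))
           if (strncmp(line, "Cpus_allowed_list:", 18) == 0)
               fputs(line, stdout);
       fclose(f);
   }

   int main(int argc, char **argv)
   {
       int rank, size, len;
       char host[MPI_MAX_PROCESSOR_NAME];

       MPI_Init(&argc, &argv);
       MPI_Comm_rank(MPI_COMM_WORLD, &rank);
       MPI_Comm_size(MPI_COMM_WORLD, &size);
       MPI_Get_processor_name(host, &len);

       printf("MPI Instance %04d of %04d is on %s, ", rank + 1, size, host);
       print_allowed();

       #pragma omp parallel
       {
           /* sched_getcpu() returns the hwthread this thread runs on at
            * this moment -- the 3-digit number in the output below */
           #pragma omp critical
           {
               printf("MPI Instance %04d of %04d is on %s: "
                      "MP thread #%04d(pid %05d), %03d, ",
                      rank + 1, size, host, omp_get_thread_num() + 1,
                      (int)getpid(), sched_getcpu());
               print_allowed();
           }
       }

       MPI_Finalize();
       return 0;
   }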
Thanks and kind regards,

Ado

On 12.04.2017 15:40, Gilles Gouaillardet wrote:
> That should be a two-step tango:
> - Open MPI binds an MPI task to a socket
> - the OpenMP runtime binds OpenMP threads to cores (or hyperthreads) inside
>   the socket assigned by Open MPI
>
> Which compiler are you using?
> Do you set some environment variables to direct OpenMP to bind threads?
>
> Also, how do you measure the hyperthread a given OpenMP thread is on?
> Is it the hyperthread used at a given time? If yes, then the thread might
> migrate unless it was pinned by the OpenMP runtime.
>
> If you are not sure, please post the source of your program so we can have
> a look.
>
> Last but not least, as long as OpenMP threads are pinned to distinct cores,
> you should not worry about them migrating between hyperthreads of the same
> core.
>
> Cheers,
>
> Gilles
>
> On Wednesday, April 12, 2017, Heinz-Ado Arnolds
> <arno...@mpa-garching.mpg.de> wrote:
>
> Dear rhc,
>
> to make it clearer what I am trying to achieve, I collected some examples
> for several combinations of command line options. It would be great if you
> could find time to look at them below. The most promising one is example "4".
>
> I'd like to have 4 MPI ranks, each starting one OpenMP region with 10
> threads, running on 2 nodes, each node having 2 sockets with 10 cores and
> 10 additional hwthreads. Only the 10 cores (no hwthreads) should be used on
> each socket:
>
>    4 MPI -> 1 OpenMP with 10 threads each (i.e. 4x10 threads)
>    2 nodes, 2 sockets each, 10 cores & 10 hwthreads each
>
> 1. mpirun -np 4 --map-by ppr:2:node --mca plm_rsh_agent "qrsh"
>    -report-bindings ./myid
>
> Machines:
>    pascal-2-05...DE 20
>    pascal-1-03...DE 20
>
> [pascal-2-05:28817] MCW rank 0 bound to socket 0[cores 0-9, hwt 0-1]:
>    [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
> [pascal-2-05:28817] MCW rank 1 bound to socket 1[cores 10-19, hwt 0-1]:
>    [../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
> [pascal-1-03:19256] MCW rank 2 bound to socket 0[cores 0-9, hwt 0-1]:
>    [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
> [pascal-1-03:19256] MCW rank 3 bound to socket 1[cores 10-19, hwt 0-1]:
>    [../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
>
> MPI Instance 0001 of 0004 is on pascal-2-05 (pid 28833), Cpus_allowed_list:
>    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>    MP threads #0001-#0010 ran on CPUs 018, 014, 028, 012, 030, 016, 038,
>    034, 020, 022
> MPI Instance 0002 of 0004 is on pascal-2-05 (pid 28834), Cpus_allowed_list:
>    1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>    MP threads #0001-#0010 ran on CPUs 007, 037, 039, 035, 031, 005, 027,
>    017, 019, 029
> MPI Instance 0003 of 0004 is on pascal-1-03 (pid 19269), Cpus_allowed_list:
>    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>    MP threads #0001-#0010 ran on CPUs 012, 034, 008, 038, 032, 036, 020,
>    002, 004, 006
> MPI Instance 0004 of 0004 is on pascal-1-03 (pid 19268), Cpus_allowed_list:
>    1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>    MP threads #0001-#0010 ran on CPUs 005, 029, 015, 007, 031, 013, 037,
>    039, 021, 023
>
> I get a distribution to 4 sockets on 2 nodes as expected, but cores and
> their corresponding hwthreads are used simultaneously: in MPI Instance 0001,
> MP thread #0001 runs on CPU 018 while MP thread #0007 runs on CPU 038, and
> MP thread #0002 runs on CPU 014 while MP thread #0008 runs on CPU 034.
> According to "lscpu -a -e", CPUs 18/38 and 14/34 are the same physical
> cores.
>
> 2. mpirun -np 4 --map-by ppr:2:node --use-hwthread-cpus -bind-to hwthread
>    --mca plm_rsh_agent "qrsh" -report-bindings ./myid
>
> Machines:
>    pascal-1-05...DE 20
>    pascal-2-05...DE 20
>
> I get this warning:
>
>    WARNING: a request was made to bind a process. While the system
>    supports binding the process itself, at least one node does NOT
>    support binding memory to the process location.
>
>      Node: pascal-1-05
>
>    Open MPI uses the "hwloc" library to perform process and memory
>    binding. This error message means that hwloc has indicated that
>    processor binding support is not available on this machine.
>
>    On OS X, processor and memory binding is not available at all (i.e.,
>    the OS does not expose this functionality).
>
>    On Linux, lack of the functionality can mean that you are on a
>    platform where processor and memory affinity is not supported in Linux
>    itself, or that hwloc was built without NUMA and/or processor affinity
>    support. When building hwloc (which, depending on your Open MPI
>    installation, may be embedded in Open MPI itself), it is important to
>    have the libnuma header and library files available. Different linux
>    distributions package these files under different names; look for
>    packages with the word "numa" in them. You may also need a developer
>    version of the package (e.g., with "dev" or "devel" in the name) to
>    obtain the relevant header files.
>
>    If you are getting this message on a non-OS X, non-Linux platform,
>    then hwloc does not support processor / memory affinity on this
>    platform. If the OS/platform does actually support processor / memory
>    affinity, then you should contact the hwloc maintainers:
>    https://github.com/open-mpi/hwloc.
>
>    This is a warning only; your job will continue, though performance may
>    be degraded.
>
> and these results:
>
> [pascal-1-05:33175] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
>    [B./../../../../../../../../..][../../../../../../../../../..]
> [pascal-1-05:33175] MCW rank 1 bound to socket 0[core 0[hwt 1]]:
>    [.B/../../../../../../../../..][../../../../../../../../../..]
> [pascal-2-05:28916] MCW rank 2 bound to socket 0[core 0[hwt 0]]:
>    [B./../../../../../../../../..][../../../../../../../../../..]
> [pascal-2-05:28916] MCW rank 3 bound to socket 0[core 0[hwt 1]]:
>    [.B/../../../../../../../../..][../../../../../../../../../..]
>
> MPI Instance 0001 of 0004 is on pascal-1-05 (pid 33193), Cpus_allowed_list: 0
>    MP threads #0001-#0010 all ran on CPU 000
> MPI Instance 0002 of 0004 is on pascal-1-05 (pid 33192), Cpus_allowed_list: 20
>    MP threads #0001-#0010 all ran on CPU 020
> MPI Instance 0003 of 0004 is on pascal-2-05 (pid 28930), Cpus_allowed_list: 0
>    MP threads #0001-#0010 all ran on CPU 000
> MPI Instance 0004 of 0004 is on pascal-2-05 (pid 28929), Cpus_allowed_list: 20
>    MP threads #0001-#0010 all ran on CPU 020
>
> Only 2 CPUs per node are used, and those two (0 and 20) are the two
> hwthreads of the same physical core.
>
> 3. mpirun -np 4 --use-hwthread-cpus -bind-to hwthread --mca plm_rsh_agent
>    "qrsh" -report-bindings ./myid
>
> Machines:
>    pascal-1-03...DE 20
>    pascal-2-02...DE 20
>
> I get the same hwloc binding warning as in example 2 (this time for node
> pascal-1-03) and these results:
>
> [pascal-1-03:19345] MCW rank 0 bound to socket 0[core 0[hwt 0]]:
>    [B./../../../../../../../../..][../../../../../../../../../..]
> [pascal-1-03:19345] MCW rank 1 bound to socket 1[core 10[hwt 0]]:
>    [../../../../../../../../../..][B./../../../../../../../../..]
> [pascal-1-03:19345] MCW rank 2 bound to socket 0[core 0[hwt 1]]:
>    [.B/../../../../../../../../..][../../../../../../../../../..]
> [pascal-1-03:19345] MCW rank 3 bound to socket 1[core 10[hwt 1]]:
>    [../../../../../../../../../..][.B/../../../../../../../../..]
>
> MPI Instance 0001 of 0004 is on pascal-1-03 (pid 19373), Cpus_allowed_list: 0
>    MP threads #0001-#0010 all ran on CPU 000
> MPI Instance 0002 of 0004 is on pascal-1-03 (pid 19372), Cpus_allowed_list: 1
>    MP threads #0001-#0010 all ran on CPU 001
> MPI Instance 0003 of 0004 is on pascal-1-03 (pid 19370), Cpus_allowed_list: 20
>    MP threads #0001-#0010 all ran on CPU 020
> MPI Instance 0004 of 0004 is on pascal-1-03 (pid 19371), Cpus_allowed_list: 21
>    MP threads #0001-#0010 all ran on CPU 021
>
> All four ranks are scheduled to one machine only.
>
> 4. mpirun -np 4 --map-by ppr:2:node --use-hwthread-cpus --mca plm_rsh_agent
>    "qrsh" -report-bindings ./myid
>
> Machines:
>    pascal-1-00...DE 20
>    pascal-3-00...DE 20
>
> [pascal-1-00:05867] MCW rank 0 bound to socket 0[cores 0-9, hwt 0-1]:
>    [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
> [pascal-1-00:05867] MCW rank 1 bound to socket 1[cores 10-19, hwt 0-1]:
>    [../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
> [pascal-3-00:07501] MCW rank 2 bound to socket 0[cores 0-9, hwt 0-1]:
>    [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
> [pascal-3-00:07501] MCW rank 3 bound to socket 1[cores 10-19, hwt 0-1]:
>    [../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
>
> MPI Instance 0001 of 0004 is on pascal-1-00 (pid 05884), Cpus_allowed_list:
>    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>    MP threads #0001-#0010 ran on CPUs 034, 038, 002, 008, 036, 000, 004,
>    006, 030, 032
> MPI Instance 0002 of 0004 is on pascal-1-00 (pid 05883), Cpus_allowed_list:
>    1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>    MP threads #0001-#0010 ran on CPUs 031, 017, 027, 039, 011, 033, 015,
>    021, 003, 025
> MPI Instance 0003 of 0004 is on pascal-3-00 (pid 07513), Cpus_allowed_list:
>    0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
>    MP threads #0001-#0010 ran on CPUs 016, 020, 022, 018, 012, 004, 008,
>    006, 030, 034
> MPI Instance 0004 of 0004 is on pascal-3-00 (pid 07514), Cpus_allowed_list:
>    1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
>    MP threads #0001-#0010 ran on CPUs 017, 025, 029, 003, 033, 001, 007,
>    039, 035, 031
>
> This distribution looks very good with this combination of options
> ("--map-by ppr:2:node --use-hwthread-cpus"), with one exception: looking at
> "MPI Instance 0002", you will find that MP thread #0001 runs on CPU 031 and
> MP thread #0005 runs on CPU 011; 011/031 are the same physical core. All
> others are perfect! Is this my mistake, or might there be a small remaining
> binding problem in Open MPI? (I checked the sibling relation as shown
> below.)
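>
> For the record, this is how I convinced myself that CPUs 011 and 031 are
> hyperthread siblings (a quick check; the sysfs path should exist on any
> recent Linux kernel, and the expected output follows from the lscpu table
> quoted below):
>
>    $ cat /sys/devices/system/cpu/cpu11/topology/thread_siblings_list
>    11,31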
>
> I'd appreciate any hint very much!
>
> Kind regards,
>
> Ado
>
> On 11.04.2017 01:36, r...@open-mpi.org wrote:
> > I'm not entirely sure I understand your reference to "real cores". When
> > we bind you to a core, we bind you to all the HTs that comprise that
> > core. So, yes, with HT enabled, the binding report will list things by
> > HT, but you'll always be bound to the full core if you tell us bind-to
> > core.
> >
> > The default binding directive is bind-to socket when more than 2
> > processes are in the job, and that's what you are showing. You can
> > override that by adding "-bind-to core" to your cmd line if that is what
> > you desire.
> >
> > If you want to use individual HTs as independent processors, then
> > "--use-hwthread-cpus -bind-to hwthreads" would indeed be the right
> > combination.
> >
> >> On Apr 10, 2017, at 3:55 AM, Heinz-Ado Arnolds
> >> <arno...@mpa-garching.mpg.de> wrote:
> >>
> >> Dear OpenMPI users & developers,
> >>
> >> I'm trying to distribute my jobs (with SGE) to a machine with a certain
> >> number of nodes, each node having 2 sockets, each socket having 10 cores
> >> & 10 hyperthreads. I'd like to use only the real cores, no
> >> hyperthreading.
> >>
> >> lscpu -a -e
> >>
> >> CPU NODE SOCKET CORE L1d:L1i:L2:L3
> >>   0    0      0    0    0:0:0:0
> >>   1    1      1    1    1:1:1:1
> >>   2    0      0    2    2:2:2:0
> >>   3    1      1    3    3:3:3:1
> >>   4    0      0    4    4:4:4:0
> >>   5    1      1    5    5:5:5:1
> >>   6    0      0    6    6:6:6:0
> >>   7    1      1    7    7:7:7:1
> >>   8    0      0    8    8:8:8:0
> >>   9    1      1    9    9:9:9:1
> >>  10    0      0   10   10:10:10:0
> >>  11    1      1   11   11:11:11:1
> >>  12    0      0   12   12:12:12:0
> >>  13    1      1   13   13:13:13:1
> >>  14    0      0   14   14:14:14:0
> >>  15    1      1   15   15:15:15:1
> >>  16    0      0   16   16:16:16:0
> >>  17    1      1   17   17:17:17:1
> >>  18    0      0   18   18:18:18:0
> >>  19    1      1   19   19:19:19:1
> >>  20    0      0    0    0:0:0:0
> >>  21    1      1    1    1:1:1:1
> >>  22    0      0    2    2:2:2:0
> >>  23    1      1    3    3:3:3:1
> >>  24    0      0    4    4:4:4:0
> >>  25    1      1    5    5:5:5:1
> >>  26    0      0    6    6:6:6:0
> >>  27    1      1    7    7:7:7:1
> >>  28    0      0    8    8:8:8:0
> >>  29    1      1    9    9:9:9:1
> >>  30    0      0   10   10:10:10:0
> >>  31    1      1   11   11:11:11:1
> >>  32    0      0   12   12:12:12:0
> >>  33    1      1   13   13:13:13:1
> >>  34    0      0   14   14:14:14:0
> >>  35    1      1   15   15:15:15:1
> >>  36    0      0   16   16:16:16:0
> >>  37    1      1   17   17:17:17:1
> >>  38    0      0   18   18:18:18:0
> >>  39    1      1   19   19:19:19:1
> >>
> >> How do I have to choose the options & parameters of mpirun to achieve
> >> this behavior?
> >>
> >> mpirun -np 4 --map-by ppr:2:node --mca plm_rsh_agent "qrsh"
> >>    -report-bindings ./myid
> >>
> >> distributes to
> >>
> >> [pascal-1-04:35735] MCW rank 0 bound to socket 0[cores 0-9, hwt 0-1]:
> >>    [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
> >> [pascal-1-04:35735] MCW rank 1 bound to socket 1[cores 10-19, hwt 0-1]:
> >>    [../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
> >> [pascal-1-03:00787] MCW rank 2 bound to socket 0[cores 0-9, hwt 0-1]:
> >>    [BB/BB/BB/BB/BB/BB/BB/BB/BB/BB][../../../../../../../../../..]
> >> [pascal-1-03:00787] MCW rank 3 bound to socket 1[cores 10-19, hwt 0-1]:
> >>    [../../../../../../../../../..][BB/BB/BB/BB/BB/BB/BB/BB/BB/BB]
> >>
> >> MPI Instance 0001 of 0004 is on pascal-1-04.MPA-Garching.MPG.DE,
> >>    Cpus_allowed_list: 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> >> MPI Instance 0002 of 0004 is on pascal-1-04.MPA-Garching.MPG.DE,
> >>    Cpus_allowed_list: 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> >> MPI Instance 0003 of 0004 is on pascal-1-03.MPA-Garching.MPG.DE,
> >>    Cpus_allowed_list: 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38
> >> MPI Instance 0004 of 0004 is on pascal-1-03.MPA-Garching.MPG.DE,
> >>    Cpus_allowed_list: 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39
> >>
> >> i.e.: 2 nodes: ok, 2 sockets: ok, different sets of cores: ok, but it
> >> uses all hwthreads.
> >>
> >> I have tried several combinations of --use-hwthread-cpus and --bind-to
> >> hwthread, but didn't find the right combination.
> >>
> >> It would be great to get any hints!
> >>
> >> Thanks a lot in advance,
> >>
> >> Heinz-Ado Arnolds
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users