From the mpirun man page:

******************

Open MPI employs a three-phase procedure for assigning process locations and ranks:

  mapping   Assigns a default location to each process
  ranking   Assigns an MPI_COMM_WORLD rank value to each process
  binding   Constrains each process to run on specific processors

The mapping step is used to assign a default location to each process based on the mapper being employed. Mapping by slot, node, and sequentially results in the assignment of the processes to the node level. In contrast, mapping by object allows the mapper to assign the process to an actual object on each node.
Note: the location assigned to the process is independent of where it will be bound - the assignment is used solely as input to the binding algorithm.

The mapping of processes to nodes can be defined not just with general policies but also, if necessary, using arbitrary mappings that cannot be described by a simple policy. One can use the "sequential mapper," which reads the hostfile line by line, assigning processes to nodes in whatever order the hostfile specifies. Use the -mca rmaps seq option. For example, using the same hostfile as before:

mpirun -hostfile myhostfile -mca rmaps seq ./a.out

will launch three processes, one on each of nodes aa, bb, and cc, respectively. The slot counts don't matter; one process is launched per line on whatever node is listed on the line.

Another way to specify arbitrary mappings is with a rankfile, which gives you detailed control over process binding as well. Rankfiles are discussed below.

The second phase focuses on the ranking of the process within the job's MPI_COMM_WORLD. Open MPI separates this from the mapping procedure to allow more flexibility in the relative placement of MPI processes.

The binding phase actually binds each process to a given set of processors. This can improve performance if the operating system is placing processes suboptimally. For example, it might oversubscribe some multi-core processor sockets, leaving other sockets idle; this can lead processes to contend unnecessarily for common resources. Or, it might spread processes out too widely; this can be suboptimal if application performance is sensitive to interprocess communication costs. Binding can also keep the operating system from migrating processes excessively, regardless of how optimally those processes were placed to begin with.

********************

So what you probably want is:

--map-by socket:pe=N --rank-by core

Remember, the pe=N modifier automatically forces binding at the cpu level. The rank-by directive defaults to rank-by socket when you map-by socket, hence you need to specify that you want it to rank by core instead. Here is the result of doing that on my box:

$ mpirun --map-by socket:pe=2 --rank-by core --report-bindings -n 8 hostname
[rhc001:154283] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../../../../../../../..][../../../../../../../../../../../..]
[rhc001:154283] MCW rank 1 bound to socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]]: [../../BB/BB/../../../../../../../..][../../../../../../../../../../../..]
[rhc001:154283] MCW rank 2 bound to socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]]: [../../../../BB/BB/../../../../../..][../../../../../../../../../../../..]
[rhc001:154283] MCW rank 3 bound to socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [../../../../../../BB/BB/../../../..][../../../../../../../../../../../..]
[rhc001:154283] MCW rank 4 bound to socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]]: [../../../../../../../../../../../..][BB/BB/../../../../../../../../../..]
[rhc001:154283] MCW rank 5 bound to socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../../../../../..][../../BB/BB/../../../../../../../..]
[rhc001:154283] MCW rank 6 bound to socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]]: [../../../../../../../../../../../..][../../../../BB/BB/../../../../../..]
[rhc001:154283] MCW rank 7 bound to socket 1[core 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../BB/BB/../../../..]
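Applied to Mark's case (6 ranks with 2 threads each on the 2-socket, 12-core/socket box), the same recipe should come out as something like this - I don't have that exact box to test on:

$ mpirun -np 6 --map-by socket:pe=2 --rank-by core --report-bindings ./prog

i.e. ranks 0-2 bound to consecutive core pairs on socket 0 and ranks 3-5 to consecutive core pairs on socket 1, which covers aims (1) and (2); for aim (3), only the -np value and the pe=N width need to change when the rank or thread counts change.

As an aside, since the man page excerpt above mentions rankfiles without showing one: a rankfile just lists an explicit placement for each rank, along the lines of

rank 0=aa slot=1:0-2
rank 1=bb slot=0:0,1
rank 2=cc slot=1-2

and is passed to mpirun with -rf <rankfile> (aka --rankfile). You only need that level of control when no mapping policy expresses what you want.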
HTH
Ralph

> On Feb 23, 2017, at 6:18 AM, <gil...@rist.or.jp> wrote:
>
> Mark,
>
> what about
> mpirun -np 6 -map-by slot:PE=4 --bind-to core --report-bindings ./prog
>
> it is a fit for 1) and 2) but not 3)
>
> if you use OpenMP and want 2 threads per task, then you can
> export OMP_NUM_THREADS=2
> so that you do not get 4 threads per task by default (with most OpenMP runtimes)
>
> Cheers,
>
> Gilles
>
> ----- Original Message -----
>> Hi,
>>
>> I'm still trying to figure out how to express the core binding I want to
>> openmpi 2.x via the --map-by option. Can anyone help, please?
>>
>> I bet I'm being dumb, but it's proving tricky to achieve the following
>> aims (most important first):
>>
>> 1) Maximise memory bandwidth usage (e.g. load-balance ranks across
>> processor sockets)
>> 2) Optimise for nearest-neighbour comms (in MPI_COMM_WORLD) (e.g. put
>> neighbouring ranks on the same socket)
>> 3) Have an incantation that's simple to change based on the number of
>> ranks and threads per rank I want.
>>
>> Example:
>>
>> Considering a 2-socket, 12-cores/socket box and a program with 2 threads
>> per rank...
>>
>> ... this is great if I fully populate the node:
>>
>> $ mpirun -np 12 -map-by slot:PE=2 --bind-to core --report-bindings ./prog
>> [somehost:101235] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././././././.][./././././././././././.]
>> [somehost:101235] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././././././.][./././././././././././.]
>> [somehost:101235] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B/./././././.][./././././././././././.]
>> [somehost:101235] MCW rank 3 bound to socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././././B/B/./././.][./././././././././././.]
>> [somehost:101235] MCW rank 4 bound to socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]]: [././././././././B/B/./.][./././././././././././.]
>> [somehost:101235] MCW rank 5 bound to socket 0[core 10[hwt 0]], socket 0[core 11[hwt 0]]: [././././././././././B/B][./././././././././././.]
>> [somehost:101235] MCW rank 6 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]]: [./././././././././././.][B/B/./././././././././.]
>> [somehost:101235] MCW rank 7 bound to socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././././././.][././B/B/./././././././.]
>> [somehost:101235] MCW rank 8 bound to socket 1[core 16[hwt 0]], socket 1[core 17[hwt 0]]: [./././././././././././.][././././B/B/./././././.]
>> [somehost:101235] MCW rank 9 bound to socket 1[core 18[hwt 0]], socket 1[core 19[hwt 0]]: [./././././././././././.][././././././B/B/./././.]
>> [somehost:101235] MCW rank 10 bound to socket 1[core 20[hwt 0]], socket 1[core 21[hwt 0]]: [./././././././././././.][././././././././B/B/./.]
>> [somehost:101235] MCW rank 11 bound to socket 1[core 22[hwt 0]], socket 1[core 23[hwt 0]]: [./././././././././././.][././././././././././B/B]
>>
>> ... but not if I don't [fails aim (1)]:
>>
>> $ mpirun -np 6 -map-by slot:PE=2 --bind-to core --report-bindings ./prog
>> [somehost:102035] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././././././.][./././././././././././.]
>> [somehost:102035] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././././././.][./././././././././././.]
>> [somehost:102035] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B/./././././.][./././././././././././.]
>> [somehost:102035] MCW rank 3 bound to socket 0[core 6[hwt 0]], socket 0[core 7[hwt 0]]: [././././././B/B/./././.][./././././././././././.]
>> [somehost:102035] MCW rank 4 bound to socket 0[core 8[hwt 0]], socket 0[core 9[hwt 0]]: [././././././././B/B/./.][./././././././././././.]
>> [somehost:102035] MCW rank 5 bound to socket 0[core 10[hwt 0]], socket 0[core 11[hwt 0]]: [././././././././././B/B][./././././././././././.]
>>
>> ... whereas if I map by socket instead of slot, I achieve aim (1) but
>> fail on aim (2):
>>
>> $ mpirun -np 6 -map-by socket:PE=2 --bind-to core --report-bindings ./prog
>> [somehost:105601] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././././././././.][./././././././././././.]
>> [somehost:105601] MCW rank 1 bound to socket 1[core 12[hwt 0]], socket 1[core 13[hwt 0]]: [./././././././././././.][B/B/./././././././././.]
>> [somehost:105601] MCW rank 2 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./././././././.][./././././././././././.]
>> [somehost:105601] MCW rank 3 bound to socket 1[core 14[hwt 0]], socket 1[core 15[hwt 0]]: [./././././././././././.][././B/B/./././././././.]
>> [somehost:105601] MCW rank 4 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B/./././././.][./././././././././././.]
>> [somehost:105601] MCW rank 5 bound to socket 1[core 16[hwt 0]], socket 1[core 17[hwt 0]]: [./././././././././././.][././././B/B/./././././.]
>>
>> Any ideas, please?
>>
>> Thanks,
>>
>> Mark
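PS: one note on Gilles' OMP_NUM_THREADS suggestion in the quoted thread: if the application is OpenMP, it usually also pays to pin the threads inside each rank's binding, not just the ranks themselves. A sketch, assuming a runtime that honours the standard OpenMP 4.0 OMP_PLACES/OMP_PROC_BIND variables (check your compiler's documentation):

export OMP_NUM_THREADS=2     # two threads per rank
export OMP_PLACES=cores      # one place per core within the rank's binding
export OMP_PROC_BIND=close   # keep the two threads on adjacent cores
mpirun -np 6 --map-by socket:pe=2 --rank-by core --report-bindings ./prog

That keeps each rank's two threads on the two cores mpirun bound the rank to, rather than letting them migrate within the binding. If the job spans multiple nodes, add -x OMP_NUM_THREADS -x OMP_PLACES -x OMP_PROC_BIND, since mpirun does not forward arbitrary environment variables by default.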
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users