Wirawan Purwanto <wiraw...@gmail.com> writes:

> Instead of the scenario above, I was trying to get the MPI processes
> side-by-side (more like "fill_up" policy in SGE scheduler), i.e. fill
> node 0 first, then fill node 1, and so on. How do I do this properly?
>
> I tried a few attempts that fail:
>
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by core:PE=2 ./EXECUTABLE
...

> Clearly I am not understanding how this map-by works. Could somebody
> help me? There was a wiki article partially written:
>
> https://github.com/open-mpi/ompi/wiki/ProcessPlacement
>
> but unfortunately it is also not clear to me.

Me neither; this stuff has traditionally been quite unclear and really
needs documenting/explaining properly.

This sort of thing from my local instructions for OMPI 1.8 probably does
what you want for OMP_NUM_THREADS=2 (where the qrsh options just get me
a couple of small nodes):

$ qrsh -pe mpi 24 -l num_proc=12 \
    mpirun -n 12 --map-by slot:PE=2 --bind-to core --report-bindings true |&
    sort -k 4 -n
[comp544:03093] MCW rank 0 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
[comp544:03093] MCW rank 1 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
[comp544:03093] MCW rank 2 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
[comp544:03093] MCW rank 3 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
[comp544:03093] MCW rank 4 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././.][././B/B/./.]
[comp544:03093] MCW rank 5 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][././././B/B]
[comp527:03056] MCW rank 6 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]]: [B/B/./././.][./././././.]
[comp527:03056] MCW rank 7 bound to socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [././B/B/./.][./././././.]
[comp527:03056] MCW rank 8 bound to socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [././././B/B][./././././.]
[comp527:03056] MCW rank 9 bound to socket 1[core 6[hwt 0]], socket 1[core 7[hwt 0]]: [./././././.][B/B/./././.]
[comp527:03056] MCW rank 10 bound to socket 1[core 8[hwt 0]], socket 1[core 9[hwt 0]]: [./././././.][././B/B/./.]
[comp527:03056] MCW rank 11 bound to socket 1[core 10[hwt 0]], socket 1[core 11[hwt 0]]: [./././././.][././././B/B]

I don't remember how I found that out.
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
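
For other thread counts, the command above appears to generalize like this: the rank count is the number of granted slots divided by OMP_NUM_THREADS, and PE is set to that same thread count (24 slots and 2 threads give the 12 ranks shown). The sketch below only assembles the option string under that assumption. `build_mpirun_args` is a hypothetical helper name, not part of Open MPI, and the options are the ones used with OMPI 1.8 above, so other versions may behave differently.

```shell
#!/bin/sh
# Sketch only: build mpirun options for a hybrid MPI+OpenMP run.
# build_mpirun_args is a hypothetical helper, not an Open MPI tool.
# Assumption: ranks = slots / threads, with PE equal to the thread count.
build_mpirun_args() {
    slots=$1      # total cores granted by the scheduler (24 in the post)
    threads=$2    # OpenMP threads per MPI rank (OMP_NUM_THREADS)

    # Reject thread counts that are not positive.
    if [ "$threads" -lt 1 ]; then
        echo "threads must be at least 1" >&2
        return 1
    fi
    # Reject slot counts that cannot be split evenly among ranks.
    if [ $((slots % threads)) -ne 0 ]; then
        echo "slots must be a multiple of threads" >&2
        return 1
    fi

    ranks=$((slots / threads))
    echo "-n $ranks --map-by slot:PE=$threads --bind-to core --report-bindings"
}

# The case from the post: 24 slots, 2 threads per rank.
build_mpirun_args 24 2
```

Prepending `mpirun` and appending the program name gives the full command line, for example `mpirun $(build_mpirun_args 24 2) ./EXECUTABLE` with OMP_NUM_THREADS=2 exported.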