FWIW: the socket option seems to work fine for me:

$ mpirun -n 12 -map-by socket:pe=2 -host rhc001 --report-bindings hostname
[rhc001:200408] MCW rank 1 bound to socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]]: [../../../../../../../../../../../..][BB/BB/../../../../../../../../../..]
[rhc001:200408] MCW rank 2 bound to socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]]: [../../BB/BB/../../../../../../../..][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 3 bound to socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../../../../../..][../../BB/BB/../../../../../../../..]
[rhc001:200408] MCW rank 4 bound to socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]]: [../../../../BB/BB/../../../../../..][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 5 bound to socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]]: [../../../../../../../../../../../..][../../../../BB/BB/../../../../../..]
[rhc001:200408] MCW rank 6 bound to socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [../../../../../../BB/BB/../../../..][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 7 bound to socket 1[core 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../BB/BB/../../../..]
[rhc001:200408] MCW rank 8 bound to socket 0[core 8[hwt 0-1]], socket 0[core 9[hwt 0-1]]: [../../../../../../../../BB/BB/../..][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 9 bound to socket 1[core 20[hwt 0-1]], socket 1[core 21[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../../../BB/BB/../..]
[rhc001:200408] MCW rank 10 bound to socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: [../../../../../../../../../../BB/BB][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 11 bound to socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../../../../../BB/BB]
[rhc001:200408] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../../../../../../../..][../../../../../../../../../../../..]
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
$
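For what it's worth, the binding maps read as follows: each bracketed group is one socket, each slash-separated field is one core, and a 'B' marks a hardware thread the rank is bound to, so 'BB' means the rank owns both hyperthreads of that core. A cheap way to cross-check the binding from inside the ranks themselves, without running the real application, is to launch a trivial command in its place. This is just a sketch, assuming Linux compute nodes; the mapping options are the ones from the run above:

$ mpirun -n 12 -map-by socket:pe=2 -host rhc001 --report-bindings grep Cpus_allowed_list /proc/self/status

Each rank's Cpus_allowed_list should cover exactly the hardware threads of the two cores shown for it in the binding map.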
I know that isn’t the pattern you are seeking - will have to ponder that one a bit. Is it possible that mpirun is not sitting on the same topology as your compute nodes?

> On Oct 3, 2016, at 2:22 PM, Wirawan Purwanto <wiraw...@gmail.com> wrote:
>
> Hi,
>
> I have been trying to understand how to correctly launch hybrid
> MPI/OpenMP (i.e. multi-threaded MPI) jobs with mpirun. I am quite
> puzzled as to which command-line options to use. The description on
> the mpirun man page is very confusing and I could not get what I wanted.
>
> Some background: the cluster is using SGE, and I am using OpenMPI 1.10.2
> compiled with and for gcc 4.9.3. The MPI library was configured with SGE
> support. The compute nodes have 32 cores each: two sockets of
> Xeon E5-2698 v3 (16-core Haswell).
>
> A colleague told me the following:
>
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by node:PE=2 ./EXECUTABLE
>
> I could see the executable using 200% of CPU per process--that's good.
> There is one catch in the general case: "-map-by node" assigns the
> MPI processes in a round-robin fashion (MPI rank 0 gets node 0, MPI
> rank 1 gets node 1, and so on until every node has one process, then
> it goes back to node 0, 1, ...).
>
> Instead of the scenario above, I was trying to place the MPI processes
> side by side (more like the "fill_up" policy in the SGE scheduler), i.e. fill
> node 0 first, then fill node 1, and so on. How do I do this properly?
>
> I tried a few attempts that failed:
>
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by core:PE=2 ./EXECUTABLE
>
> or
>
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by socket:PE=2 ./EXECUTABLE
>
> Both failed with an error message:
>
> --------------------------------------------------------------------------
> A request for multiple cpus-per-proc was given, but a directive
> was also give to map to an object level that cannot support that
> directive.
>
> Please specify a mapping level that has more than one cpu, or
> else let us define a default mapping that will allow multiple
> cpus-per-proc.
> --------------------------------------------------------------------------
>
> Another attempt was:
>
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by socket:PE=2 -bind-to socket ./EXECUTABLE
>
> Here's the error message:
>
> --------------------------------------------------------------------------
> A request for multiple cpus-per-proc was given, but a conflicting binding
> policy was specified:
>
>   #cpus-per-proc:       2
>   type of cpus:         cores as cpus
>   binding policy given: SOCKET
>
> The correct binding policy for the given type of cpu is:
>
>   correct binding policy: bind-to core
>
> This is the binding policy we would apply by default for this
> situation, so no binding need be specified. Please correct the
> situation and try again.
> --------------------------------------------------------------------------
>
> Clearly I am not understanding how this map-by works. Could somebody
> help me? There is a partially written wiki article:
>
> https://github.com/open-mpi/ompi/wiki/ProcessPlacement
>
> but unfortunately it is also not clear to me.
>
> --
> Wirawan Purwanto
> Computational Scientist, HPC Group
> Information Technology Services
> Old Dominion University
> Norfolk, VA 23529
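To recap the failed attempts quoted above: -map-by core:PE=2 cannot work, because a core is a single processing element and cannot host two of them; the first error asks for a mapping level with more than one cpu, and socket (or node, as the colleague's command showed) is such a level. The second error says that for PE=2 the matching binding is bind-to core and that it is applied by default, so the explicit -bind-to socket is what conflicts. Dropping it gives the same form as the working run at the top of this mail. A sketch, reusing OMP_NUM_THREADS=2 and the 16-rank count from the original mail, with --report-bindings added as a cheap placement check:

$ export OMP_NUM_THREADS=2
$ mpirun -np 16 -map-by socket:PE=2 --report-bindings ./EXECUTABLE

The original mail reports this very form failing, which is why the topology question above matters: if mpirun itself runs on a machine with a different socket/core layout than the compute nodes, the mapper may reject a request that would be valid on the compute nodes. Note also, as the binding map above shows, that socket mapping fills one node completely before moving to the next, but within the node it alternates ranks between the two sockets rather than filling cores strictly in sequence.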