FWIW: the socket option seems to work fine for me:

$ mpirun -n 12 -map-by socket:pe=2 -host rhc001 --report-bindings hostname
[rhc001:200408] MCW rank 1 bound to socket 1[core 12[hwt 0-1]], socket 1[core 13[hwt 0-1]]: [../../../../../../../../../../../..][BB/BB/../../../../../../../../../..]
[rhc001:200408] MCW rank 2 bound to socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]]: [../../BB/BB/../../../../../../../..][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 3 bound to socket 1[core 14[hwt 0-1]], socket 1[core 15[hwt 0-1]]: [../../../../../../../../../../../..][../../BB/BB/../../../../../../../..]
[rhc001:200408] MCW rank 4 bound to socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]]: [../../../../BB/BB/../../../../../..][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 5 bound to socket 1[core 16[hwt 0-1]], socket 1[core 17[hwt 0-1]]: [../../../../../../../../../../../..][../../../../BB/BB/../../../../../..]
[rhc001:200408] MCW rank 6 bound to socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]]: [../../../../../../BB/BB/../../../..][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 7 bound to socket 1[core 18[hwt 0-1]], socket 1[core 19[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../BB/BB/../../../..]
[rhc001:200408] MCW rank 8 bound to socket 0[core 8[hwt 0-1]], socket 0[core 9[hwt 0-1]]: [../../../../../../../../BB/BB/../..][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 9 bound to socket 1[core 20[hwt 0-1]], socket 1[core 21[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../../../BB/BB/../..]
[rhc001:200408] MCW rank 10 bound to socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]: [../../../../../../../../../../BB/BB][../../../../../../../../../../../..]
[rhc001:200408] MCW rank 11 bound to socket 1[core 22[hwt 0-1]], socket 1[core 23[hwt 0-1]]: [../../../../../../../../../../../..][../../../../../../../../../../BB/BB]
[rhc001:200408] MCW rank 0 bound to socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]]: [BB/BB/../../../../../../../../../..][../../../../../../../../../../../..]
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
rhc001
$

I know that isn't the pattern you are seeking - I will have to ponder that one a
bit. Is it possible that mpirun is running on a node with a different topology
than your compute nodes?
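
In the meantime, one thing that might get you closer to the fill-up pattern
(just a sketch, untested against your SGE allocation) is to map by slot
instead of by node:

$ export OMP_NUM_THREADS=2
$ mpirun -np 16 -map-by slot:PE=2 --report-bindings ./EXECUTABLE

Mapping by slot should keep placing consecutive ranks on the same node, two
cores per rank, until that node's slots are used up before moving on to the
next node; --report-bindings will show what you actually got. If a topology
mismatch is a possibility, running lstopo (from hwloc) on the node where you
invoke mpirun and on one of the compute nodes and comparing the output would
settle it.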


> On Oct 3, 2016, at 2:22 PM, Wirawan Purwanto <wiraw...@gmail.com> wrote:
> 
> Hi,
> 
> I have been trying to understand how to correctly launch hybrid
> MPI/OpenMP (i.e. multi-threaded MPI) jobs with mpirun. I am quite
> puzzled as to which command-line options to use. The description on
> the mpirun man page is very confusing, and I could not get what I
> wanted.
> 
> Some background: the cluster uses SGE, and I am using OpenMPI 1.10.2
> compiled with and for gcc 4.9.3. The MPI library was configured with
> SGE support. The compute nodes have 32 cores each: two sockets of
> Xeon E5-2698 v3 (16-core Haswell).
> 
> A colleague told me the following:
> 
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by node:PE=2 ./EXECUTABLE
> 
> I could see the executable using 200% of CPU per process--that's good.
> There is one catch in the general case. "-map-by node" will assign the
> MPI processes in a round-robin fashion (so MPI rank 0 gets node 0, MPI
> rank 1 gets node 1, and so on until all nodes are given 1 process,
> then it will go back to node 0,1, ...).
> 
> Instead of the scenario above, I was trying to get the MPI processes
> side by side (more like the "fill_up" policy in the SGE scheduler), i.e. fill
> node 0 first, then fill node 1, and so on. How do I do this properly?
> 
> I tried a few attempts that fail:
> 
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by core:PE=2 ./EXECUTABLE
> 
> or
> 
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by socket:PE=2 ./EXECUTABLE
> 
> Both failed with an error message:
> 
> --------------------------------------------------------------------------
> A request for multiple cpus-per-proc was given, but a directive
> was also give to map to an object level that cannot support that
> directive.
> 
> Please specify a mapping level that has more than one cpu, or
> else let us define a default mapping that will allow multiple
> cpus-per-proc.
> --------------------------------------------------------------------------
> 
> Another attempt was:
> 
> $ export OMP_NUM_THREADS=2
> $ mpirun -np 16 -map-by socket:PE=2 -bind-to socket ./EXECUTABLE
> 
> Here's the error message:
> 
> --------------------------------------------------------------------------
> A request for multiple cpus-per-proc was given, but a conflicting binding
> policy was specified:
> 
>  #cpus-per-proc:  2
>  type of cpus:    cores as cpus
>  binding policy given: SOCKET
> 
> The correct binding policy for the given type of cpu is:
> 
>  correct binding policy:  bind-to core
> 
> This is the binding policy we would apply by default for this
> situation, so no binding need be specified. Please correct the
> situation and try again.
> --------------------------------------------------------------------------
> 
> Clearly I do not understand how this map-by option works. Could somebody
> help me? There is a partially written wiki article:
> 
> https://github.com/open-mpi/ompi/wiki/ProcessPlacement
> 
> but unfortunately it is also not clear to me.
> 
> -- 
> Wirawan Purwanto
> Computational Scientist, HPC Group
> Information Technology Services
> Old Dominion University
> Norfolk, VA 23529

_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users
