Sure. Here's the basic run, with no KMP_AFFINITY and no binding options:

(1159) $ env OMP_NUM_THREADS=7 mpirun -np 4 ./hello-hybrid.x | sort -g -k 18
Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 1
Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 1
Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 1
Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 1
Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 2
Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 3
Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 3
Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 3
Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 3
Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 5
Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 10
Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 13
Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 13
Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 15
Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 15
Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU 16
Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 17
Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU 19
Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 21
Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU 24
Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU 25
Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 26
Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 26
Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 27
Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 27
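
(In case it's useful, hello-hybrid.x is essentially the sketch below. This
is a paraphrase rather than the exact source, and it assumes glibc's
sched_getcpu(); compile with something like
"mpicc -fopenmp hello-hybrid.c -o hello-hybrid.x".)

/* hello-hybrid.c: hybrid MPI+OpenMP hello world (sketch) */
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sched.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv)
{
    int rank, size;
    char host[256];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    gethostname(host, sizeof(host));

#pragma omp parallel
    {
        /* sched_getcpu() reports where the thread is running *right now*;
           unbound threads can migrate between prints */
        printf("Hello from thread %d out of %d from process %d out of %d on %s on CPU %d\n",
               omp_get_thread_num(), omp_get_num_threads(),
               rank, size, host, sched_getcpu());
    }

    MPI_Finalize();
    return 0;
}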

So, in that output we get spots where four threads share a single CPU, and
so on. Using -bind-to core does interesting things:

(1162) $ env OMP_NUM_THREADS=7 mpirun -np 4 -bind-to core ./hello-hybrid.x | sort -g -k 18
Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 0
Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 1
Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 1
Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 1
Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 1
Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 1
Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 1
Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 1
Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 14
Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 15
Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 15
Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 15
Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU 15
Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU 15
Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU 15
Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU 15

And, if you think about it, it makes sense: I've bound each MPI process to a
core (I'd prefer cores 0, 7, 14, 21 instead of 0, 1, 14, 15; I'm guessing
Open MPI can do this?), but then all of a process's OpenMP threads execute
on that one core, which is not ideal.
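
A hedged guess at what might fix both problems (I haven't verified this on
1.10, so treat it as a sketch): the pe=N modifier to -map-by is supposed to
widen each process's binding to N cores, OMP_PROC_BIND=true asks the OpenMP
runtime to keep threads where they land, and --report-bindings will show
what binding actually happened:

$ env OMP_NUM_THREADS=7 OMP_PROC_BIND=true mpirun -np 4 \
    -map-by ppr:2:socket:pe=7 --report-bindings ./hello-hybrid.x | sort -g -k 18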

Note: we are working here on getting SGI omplace/dplace built and
installed. I know they work with MPT and Intel MPI, but I'm guessing they
work with Open MPI as well? Then I can follow this:
http://www.nas.nasa.gov/hecc/support/kb/using-sgi-omplace-for-pinning_287.html
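
If omplace pans out, my reading of that KB page is that the usage would be
something like "mpirun -np 4 omplace -nt 7 ./hello-hybrid.x", but I can't
test that until it's actually installed here.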

Matt


On Wed, Jan 6, 2016 at 2:48 PM, Erik Schnetter <schnet...@gmail.com> wrote:

> Setting KMP_AFFINITY will probably override anything that OpenMPI
> sets. Can you try without?
>
> -erik
>
> On Wed, Jan 6, 2016 at 2:46 PM, Matt Thompson <fort...@gmail.com> wrote:
> > Hello Open MPI Gurus,
> >
> > As I explore MPI-OpenMP hybrid codes, I'm trying to figure out how to do
> > things to get the same behavior in various stacks. For example, I have a
> > 28-core node (2 14-core Haswells), and I'd like to run 4 MPI processes and
> > 7 OpenMP threads. Thus, I'd like the processes to be 2 processes per socket
> > with the OpenMP threads laid out on them. Using a "hybrid Hello World"
> > program, I can achieve this with Intel MPI (after a lot of testing):
> >
> > (1097) $ env OMP_NUM_THREADS=7 KMP_AFFINITY=compact mpirun -np 4
> > ./hello-hybrid.x | sort -g -k 18
> > srun.slurm: cluster configuration lacks support for cpu binding
> > Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 0
> > Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 1
> > Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 2
> > Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 3
> > Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 4
> > Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 5
> > Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 6
> > Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 7
> > Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 8
> > Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 9
> > Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU 10
> > Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU 11
> > Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU 12
> > Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU 13
> > Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 14
> > Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 15
> > Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 16
> > Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 17
> > Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 18
> > Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 19
> > Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 20
> > Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 21
> > Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 22
> > Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 23
> > Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 24
> > Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 25
> > Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 26
> > Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 27
> >
> > Other than the odd fact that Process #0 seemed to start on Socket #1 (this
> > might be an artifact of how I'm trying to detect the CPU I'm on), this
> > looks reasonable. 14 threads on each socket and each process is laying out
> > its threads in a nice orderly fashion.
> >
> > I'm trying to figure out how to do this with Open MPI (version 1.10.0) and
> > apparently I am just not quite good enough to figure it out. The closest
> > I've gotten is:
> >
> > (1155) $ env OMP_NUM_THREADS=7 KMP_AFFINITY=compact mpirun -np 4 -map-by
> > ppr:2:socket ./hello-hybrid.x | sort -g -k 18
> > Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 0
> > Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 0
> > Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 1
> > Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 1
> > Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 2
> > Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 2
> > Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 3
> > Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 3
> > Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 4
> > Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 4
> > Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 5
> > Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 5
> > Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 6
> > Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 6
> > Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 14
> > Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 14
> > Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 15
> > Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 15
> > Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 16
> > Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 16
> > Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 17
> > Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU 17
> > Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 18
> > Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU 18
> > Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 19
> > Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU 19
> > Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 20
> > Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU 20
> >
> > Obviously not right. Any ideas on how to help me learn? The mpirun man
> > page is a bit formidable in the pinning part, so maybe I've missed an obvious
> > answer.
> >
> > Matt
> > --
> > Matt Thompson
> >
> > Man Among Men
> > Fulcrum of History
> >
> >
>
> --
> Erik Schnetter <schnet...@gmail.com>
> http://www.perimeterinstitute.ca/personal/eschnetter/



-- 
Matt Thompson

Man Among Men
Fulcrum of History
