Sure. Here's the basic one: (1159) $ env OMP_NUM_THREADS=7 mpirun -np 4 ./hello-hybrid.x | sort -g -k 18 Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 0 Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 1 Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 1 Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 1 Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 1 Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 2 Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 3 Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 3 Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 3 Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 3 Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 5 Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 10 Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 13 Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 13 Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 15 Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 15 Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU 16 Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 17 Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU 19 Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 21 Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU 24 Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU 25 Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 26 Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 26 Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 27 Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 27
So, we get some where 4 threads are on a CPU, etc. Using -bind-to core does interesting things: (1162) $ env OMP_NUM_THREADS=7 mpirun -np 4 -bind-to core ./hello-hybrid.x | sort -g -k 18 Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 0 Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 0 Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 0 Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 0 Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 0 Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 0 Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 0 Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 1 Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 1 Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 1 Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 1 Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 1 Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 1 Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 1 Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 14 Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 15 Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 15 Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 15 Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU 15 Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU 15 Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU 15 Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU 15 And if you think it makes sense. I've bound the MPI processes to a core (I'd prefer cores 0, 7, 14, 21 instead of 0,1,14,15. I'm guessing Open MPI can do this?), but then all the OpenMP threads are executing on that core which is not ideal. Note: we are working here on getting SGI omplace/dplace built and installed. I know that works with MPT and Intel MPI, but I'm guessing Open MPI can as well? Then I can follow this: http://www.nas.nasa.gov/hecc/support/kb/using-sgi-omplace-for-pinning_287.html Matt On Wed, Jan 6, 2016 at 2:48 PM, Erik Schnetter <schnet...@gmail.com> wrote: > Setting KMP_AFFINITY will probably override anything that OpenMPI > sets. Can you try without? > > -erik > > On Wed, Jan 6, 2016 at 2:46 PM, Matt Thompson <fort...@gmail.com> wrote: > > Hello Open MPI Gurus, > > > > As I explore MPI-OpenMP hybrid codes, I'm trying to figure out how to do > > things to get the same behavior in various stacks. For example, I have a > > 28-core node (2 14-core Haswells), and I'd like to run 4 MPI processes > and 7 > > OpenMP threads. Thus, I'd like the processes to be 2 processes per socket > > with the OpenMP threads laid out on them. Using a "hybrid Hello World" > > program, I can achieve this with Intel MPI (after a lot of testing): > > > > (1097) $ env OMP_NUM_THREADS=7 KMP_AFFINITY=compact mpirun -np 4 > > ./hello-hybrid.x | sort -g -k 18 > > srun.slurm: cluster configuration lacks support for cpu binding > > Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU 0 > > Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU 1 > > Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU 2 > > Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU 3 > > Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU 4 > > Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU 5 > > Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU 6 > > Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU 7 > > Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU 8 > > Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU 9 > > Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU > 10 > > Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU > 11 > > Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU > 12 > > Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU > 13 > > Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU > 14 > > Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU > 15 > > Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU > 16 > > Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU > 17 > > Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU > 18 > > Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU > 19 > > Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU > 20 > > Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU > 21 > > Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU > 22 > > Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU > 23 > > Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU > 24 > > Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU > 25 > > Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU > 26 > > Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU > 27 > > > > Other than the odd fact that Process #0 seemed to start on Socket #1 > (this > > might be an artifact of how I'm trying to detect the CPU I'm on), this > looks > > reasonable. 14 threads on each socket and each process is laying out its > > threads in a nice orderly fashion. > > > > I'm trying to figure out how to do this with Open MPI (version 1.10.0) > and > > apparently I am just not quite good enough to figure it out. The closest > > I've gotten is: > > > > (1155) $ env OMP_NUM_THREADS=7 KMP_AFFINITY=compact mpirun -np 4 -map-by > > ppr:2:socket ./hello-hybrid.x | sort -g -k 18 > > Hello from thread 0 out of 7 from process 0 out of 4 on borgo035 on CPU 0 > > Hello from thread 0 out of 7 from process 1 out of 4 on borgo035 on CPU 0 > > Hello from thread 1 out of 7 from process 0 out of 4 on borgo035 on CPU 1 > > Hello from thread 1 out of 7 from process 1 out of 4 on borgo035 on CPU 1 > > Hello from thread 2 out of 7 from process 0 out of 4 on borgo035 on CPU 2 > > Hello from thread 2 out of 7 from process 1 out of 4 on borgo035 on CPU 2 > > Hello from thread 3 out of 7 from process 0 out of 4 on borgo035 on CPU 3 > > Hello from thread 3 out of 7 from process 1 out of 4 on borgo035 on CPU 3 > > Hello from thread 4 out of 7 from process 0 out of 4 on borgo035 on CPU 4 > > Hello from thread 4 out of 7 from process 1 out of 4 on borgo035 on CPU 4 > > Hello from thread 5 out of 7 from process 0 out of 4 on borgo035 on CPU 5 > > Hello from thread 5 out of 7 from process 1 out of 4 on borgo035 on CPU 5 > > Hello from thread 6 out of 7 from process 0 out of 4 on borgo035 on CPU 6 > > Hello from thread 6 out of 7 from process 1 out of 4 on borgo035 on CPU 6 > > Hello from thread 0 out of 7 from process 2 out of 4 on borgo035 on CPU > 14 > > Hello from thread 0 out of 7 from process 3 out of 4 on borgo035 on CPU > 14 > > Hello from thread 1 out of 7 from process 2 out of 4 on borgo035 on CPU > 15 > > Hello from thread 1 out of 7 from process 3 out of 4 on borgo035 on CPU > 15 > > Hello from thread 2 out of 7 from process 2 out of 4 on borgo035 on CPU > 16 > > Hello from thread 2 out of 7 from process 3 out of 4 on borgo035 on CPU > 16 > > Hello from thread 3 out of 7 from process 2 out of 4 on borgo035 on CPU > 17 > > Hello from thread 3 out of 7 from process 3 out of 4 on borgo035 on CPU > 17 > > Hello from thread 4 out of 7 from process 2 out of 4 on borgo035 on CPU > 18 > > Hello from thread 4 out of 7 from process 3 out of 4 on borgo035 on CPU > 18 > > Hello from thread 5 out of 7 from process 2 out of 4 on borgo035 on CPU > 19 > > Hello from thread 5 out of 7 from process 3 out of 4 on borgo035 on CPU > 19 > > Hello from thread 6 out of 7 from process 2 out of 4 on borgo035 on CPU > 20 > > Hello from thread 6 out of 7 from process 3 out of 4 on borgo035 on CPU > 20 > > > > Obviously not right. Any ideas on how to help me learn? The man mpirun > page > > is a bit formidable in the pinning part, so maybe I've missed an obvious > > answer. > > > > Matt > > -- > > Matt Thompson > > > > Man Among Men > > Fulcrum of History > > > > > > _______________________________________________ > > users mailing list > > us...@open-mpi.org > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > > Link to this post: > > http://www.open-mpi.org/community/lists/users/2016/01/28217.php > > > > -- > Erik Schnetter <schnet...@gmail.com> > http://www.perimeterinstitute.ca/personal/eschnetter/ > _______________________________________________ > users mailing list > us...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2016/01/28218.php > -- Matt Thompson Man Among Men Fulcrum of History