Thanks for sending that lstopo output - it helped clarify things for me. I 
think I now understand the issue; it was mostly a problem of my being rather 
dense when reading your earlier note.

Try adding --map-by node:PE=N to your cmd line. I think the problem is that we 
default to --map-by numa if you just give --cpus-per-proc and no mapping 
directive, as we know that having threads span multiple NUMA regions is bad 
for performance.


> On Dec 5, 2014, at 9:07 AM, John Bray <jb...@allinea.com> wrote:
> 
> Hi Ralph
> 
> I have a motherboard with 2 X6580 chips, each with 6 cores and 2-way 
> hyperthreading, so /proc/cpuinfo reports 24 cores
> 
> I'm running a pure-compute OpenMP loop, where I'd expect the number of 
> iterations completed in 10s to rise with the number of threads.
> With gnu and mpich:
> OMP_NUM_THREADS=1 -n 1 : 112 iterations
> OMP_NUM_THREADS=2 -n 1 : 224 iterations
> OMP_NUM_THREADS=6 -n 1 : 644 iterations
> OMP_NUM_THREADS=12 -n 1 : 1287 iterations
> OMP_NUM_THREADS=22 -n 1 : 1182 iterations
> OMP_NUM_THREADS=24 -n 1 : 454 iterations
> 
> which shows that mpich is spreading the threads across the cores, but 
> hyperthreading is not useful, and using the whole node is counterproductive
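> 
> (For reference, the timed kernel is essentially of this shape - a 
> hypothetical sketch of the benchmark, not the exact code; the array size 
> and update are made up:
> 
>   #include <omp.h>
>   #include <stdio.h>
> 
>   int main(void)
>   {
>       enum { N = 1 << 20 };
>       static double a[N];
>       int iters = 0;
>       double t0 = omp_get_wtime();
> 
>       /* count how many sweeps of a fixed-size loop finish in 10s */
>       while (omp_get_wtime() - t0 < 10.0) {
>           #pragma omp parallel for
>           for (int i = 0; i < N; i++)
>               a[i] = a[i] * 1.000001 + 0.5;   /* pure compute, no I/O */
>           iters++;
>       }
>       printf("%d iterations with %d threads\n", iters,
>              omp_get_max_threads());
>       return 0;
>   }
> 
> compiled with gcc -fopenmp.)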
> 
> with gnu and openmpi 1.8.3
> OMP_NUM_THREADS=1 mpiexec -n 1 : 112
> OMP_NUM_THREADS=2 mpiexec -n 1 : 113
> which suggests you aren't allowing the threads to spread across cores
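> 
> (My understanding is that openmpi 1.8 binds each rank to a single core by 
> default for small jobs, which would pin both OpenMP threads to one core. A 
> quick way to test that would be to disable binding, e.g.
> 
>   OMP_NUM_THREADS=2 mpiexec --bind-to none -n 1 ./omp_bench
> 
> where ./omp_bench stands in for the benchmark binary.)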
> 
> adding --cpus-per-proc, I gain access to the resources on one chip
> 
> OMP_NUM_THREADS=1 mpiexec --cpus-per-proc 1 -n 1 : 112
> OMP_NUM_THREADS=2 mpiexec --cpus-per-proc 2 -n 1 : 224
> OMP_NUM_THREADS=6 mpiexec --cpus-per-proc 2 -n 1 : 644
> then
> OMP_NUM_THREADS=12 mpiexec --cpus-per-proc 12 -n 1
> 
> A request for multiple cpus-per-proc was given, but a directive
> was also give to map to an object level that has less cpus than
> requested ones:
> 
>   #cpus-per-proc:  12
>   number of cpus:  6
>   map-by:          BYNUMA
> 
> So you aren't happy using both chips for one process
> 
> OMP_NUM_THREADS=1 mpiexec -n 1 --cpus-per-proc 1 --use-hwthread-cpus : 112
> OMP_NUM_THREADS=2 mpiexec -n 1 --cpus-per-proc 2 --use-hwthread-cpus : 112
> OMP_NUM_THREADS=4 mpiexec -n 1 --cpus-per-proc 4 --use-hwthread-cpus : 224
> OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 6 --use-hwthread-cpus : 324
> OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 631
> OMP_NUM_THREADS=12 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 647
> 
> OMP_NUM_THREADS=24 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus 
> 
> A request for multiple cpus-per-proc was given, but a directive
> was also give to map to an object level that has less cpus than
> requested ones:
> 
>   #cpus-per-proc:  24
>   number of cpus:  12
>   map-by:          BYNUMA
> 
> OMP_NUM_THREADS=1 mpiexec -n 1 --cpus-per-proc 2 --use-hwthread-cpus : 112
> OMP_NUM_THREADS=2 mpiexec -n 1 --cpus-per-proc 4 --use-hwthread-cpus : 224
> OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 644
> 
> OMP_NUM_THREADS=12 mpiexec -n 1 --cpus-per-proc 24 --use-hwthread-cpus : 644
> 
> A request for multiple cpus-per-proc was given, but a directive
> was also give to map to an object level that has less cpus than
> requested ones:
> 
>   #cpus-per-proc:  24
>   number of cpus:  12
>   map-by:          BYNUMA
> 
> So it seems that --use-hwthread-cpus means that --cpus-per-proc counts 
> hyperthreaded cores rather than physical cores, but I can't get both chips 
> working on the problem in the way mpich can
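> 
> (Open MPI's --report-bindings option prints where each rank ends up bound, 
> which would make it easy to see whether a given flag combination actually 
> reaches the second chip, e.g.
> 
>   OMP_NUM_THREADS=12 mpiexec --report-bindings --cpus-per-proc 12 \
>       --use-hwthread-cpus -n 1 ./omp_bench
> 
> again with ./omp_bench standing in for the benchmark.)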
> 
> John
> 
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/12/25919.php
