Thanks for sending that lstopo output - it helped clarify things for me. I think I now understand the issue. Mostly a problem of my being rather dense when reading your earlier note.
Try adding --map-by node:PE=N to your cmd line. I think the problem is that we default to --map-by numa if you just give cpus-per-proc and no mapping directive, as we know that having threads that span multiple NUMA regions is bad for performance.

> On Dec 5, 2014, at 9:07 AM, John Bray <jb...@allinea.com> wrote:
>
> Hi Ralph
>
> I have a motherboard with 2 X6580 chips, each with 6 cores and 2-way
> hyperthreading, so /proc/cpuinfo reports 24 cores.
>
> Doing a pure compute OpenMP loop where I'd expect the number of iterations
> in 10s to rise with the number of threads, with gnu and mpich:
>
> OMP_NUM_THREADS=1 -n 1 : 112 iterations
> OMP_NUM_THREADS=2 -n 1 : 224 iterations
> OMP_NUM_THREADS=6 -n 1 : 644 iterations
> OMP_NUM_THREADS=12 -n 1 : 1287 iterations
> OMP_NUM_THREADS=22 -n 1 : 1182 iterations
> OMP_NUM_THREADS=24 -n 1 : 454 iterations
>
> which shows that mpich is spreading across the cores, but hyperthreading is
> not useful, and using the whole node is counterproductive.
>
> With gnu and openmpi 1.8.3:
>
> OMP_NUM_THREADS=1 mpiexec -n 1 : 112
> OMP_NUM_THREADS=2 mpiexec -n 1 : 113
>
> which suggests you aren't allowing the threads to spread across cores.
>
> Adding --cpus-per-proc I gain access to the resources on one chip:
>
> OMP_NUM_THREADS=1 mpiexec --cpus-per-proc 1 -n 1 : 112
> OMP_NUM_THREADS=2 mpiexec --cpus-per-proc 2 -n 1 : 224
> OMP_NUM_THREADS=6 mpiexec --cpus-per-proc 2 -n 1 : 644
>
> then
>
> OMP_NUM_THREADS=12 mpiexec --cpus-per-proc 12 -n 1
>
> A request for multiple cpus-per-proc was given, but a directive
> was also give to map to an object level that has less cpus than
> requested ones:
>
>   #cpus-per-proc:  12
>   number of cpus:  6
>   map-by:          BYNUMA
>
> So you aren't happy using both chips for one process.
>
> OMP_NUM_THREADS=1 mpiexec -n 1 --cpus-per-proc 1 --use-hwthread-cpus : 112
> OMP_NUM_THREADS=2 mpiexec -n 1 --cpus-per-proc 2 --use-hwthread-cpus : 112
> OMP_NUM_THREADS=4 mpiexec -n 1 --cpus-per-proc 4 --use-hwthread-cpus : 224
> OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 6 --use-hwthread-cpus : 324
> OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 631
> OMP_NUM_THREADS=12 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 647
>
> OMP_NUM_THREADS=24 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus
>
> A request for multiple cpus-per-proc was given, but a directive
> was also give to map to an object level that has less cpus than
> requested ones:
>
>   #cpus-per-proc:  24
>   number of cpus:  12
>   map-by:          BYNUMA
>
> OMP_NUM_THREADS=1 mpiexec -n 1 --cpus-per-proc 2 --use-hwthread-cpus : 112
> OMP_NUM_THREADS=2 mpiexec -n 1 --cpus-per-proc 4 --use-hwthread-cpus : 224
> OMP_NUM_THREADS=6 mpiexec -n 1 --cpus-per-proc 12 --use-hwthread-cpus : 644
>
> OMP_NUM_THREADS=12 mpiexec -n 1 --cpus-per-proc 24 --use-hwthread-cpus
>
> A request for multiple cpus-per-proc was given, but a directive
> was also give to map to an object level that has less cpus than
> requested ones:
>
>   #cpus-per-proc:  24
>   number of cpus:  12
>   map-by:          BYNUMA
>
> So it seems that --use-hwthread-cpus means that --cpus-per-proc changes from
> physical cores to hyperthreaded cores, but I can't get both chips working on
> the problem in the way mpich can.
>
> John
>
> _______________________________________________
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post:
> http://www.open-mpi.org/community/lists/users/2014/12/25919.php
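As a concrete sketch of the suggestion above: with PE=N set to the thread count, the single rank is mapped by node rather than by NUMA region, so its binding can span both sockets. The binary name `./omp_bench` is a placeholder for John's OpenMP benchmark, not anything from the thread.

```shell
# One rank, 12 cores bound to it, mapped at node level so the
# binding may cross the NUMA boundary between the two sockets.
OMP_NUM_THREADS=12 mpiexec --map-by node:PE=12 -n 1 ./omp_bench

# Add --report-bindings to have Open MPI print the resulting
# binding mask so the core placement can be verified.
OMP_NUM_THREADS=12 mpiexec --map-by node:PE=12 --report-bindings -n 1 ./omp_bench
```

These are command-line fragments only; they require an Open MPI 1.8.x install and are untested here.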