On Jun 3, 2009, at 10:48 AM, <jacob_liber...@dell.com> wrote:
> For HPL, try writing a bash script that pins processes to their
> local memory controllers using numactl before kicking off HPL. This
> is particularly helpful when spawning more than 1 thread per
> process. The last line of your script should look like "numactl -c
> $cpu_bind -m $mem_bind $*".
>
> Believe it or not, I hit 94.5% HPL efficiency using this tactic on a
> 16 node cluster. Using processor affinity (various MPIs) my results
> were inconsistent and ranged between 88-93%.

If you're using multi-threaded HPL, that might be useful. But if
you're not, I'd be surprised if your results differed from what you
get when Open MPI does the binding itself. If there really is a
difference, we should figure out why. More specifically, calling
numactl yourself should do pretty much exactly what we do in OMPI
(via the API, not by calling numactl).
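
For reference, the kind of wrapper script described in the quoted
message might look roughly like the sketch below. It is only an
illustration under several assumptions that are not in the original
post: that the launcher exports OMPI_COMM_WORLD_LOCAL_RANK, that a
hypothetical NUMA_NODES variable gives the number of NUMA nodes per
host, and that a simple round-robin mapping of local ranks to nodes is
acceptable. It uses numactl's long options --cpunodebind and --membind
in place of the short -c and -m forms quoted above.

```bash
#!/bin/bash
# Sketch of a NUMA-binding wrapper; names and the mapping policy are assumptions.

# Map a local rank to a NUMA node index, round-robin across the nodes.
numa_node_for_rank() {
    local rank=$1 nodes=$2
    echo $(( rank % nodes ))
}

# Assumption: the launcher exports this variable; default to rank 0 if not.
local_rank=${OMPI_COMM_WORLD_LOCAL_RANK:-0}
# Hypothetical setting: number of NUMA nodes per host (default 2).
numa_nodes=${NUMA_NODES:-2}

node=$(numa_node_for_rank "$local_rank" "$numa_nodes")

# Only launch when a program was given on the command line.
if [ "$#" -gt 0 ]; then
    # Bind both the CPUs and the memory allocations to the same NUMA node.
    exec numactl --cpunodebind="$node" --membind="$node" "$@"
fi
```

If the script were saved under a hypothetical name such as bind.sh, it
could be launched with something like "mpirun -np 8 ./bind.sh ./xhpl".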
--
Jeff Squyres
Cisco Systems