On Jun 3, 2009, at 10:48 AM, <jacob_liber...@dell.com> wrote:

For HPL, try writing a bash script that pins processes to their local memory controllers using numactl before kicking off HPL. This is particularly helpful when spawning more than 1 thread per process. The last line of your script should look like "numactl -c $cpu_bind -m $mem_bind $*".
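
A minimal sketch of such a wrapper follows. It is not the script used for the results quoted here. It assumes the launcher exports a per-node rank (Open MPI sets OMPI_COMM_WORLD_LOCAL_RANK), that each host has two NUMA nodes unless the hypothetical NUMA_NODES variable says otherwise, and that each process should use the CPUs and memory of the same node. It uses numactl's long options --cpunodebind and --membind in place of the short -c and -m forms, and "$@" in place of $* so arguments containing spaces survive. The script name and the numa_bind_args helper are made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper script, e.g. numa_wrap.sh.

# Print numactl binding options for a given local rank and NUMA node count.
# Ranks are spread round-robin over the nodes; CPU and memory use the same node.
numa_bind_args() {
    local local_rank=$1 numa_nodes=$2
    local node=$(( local_rank % numa_nodes ))
    echo "--cpunodebind=$node --membind=$node"
}

# Only launch when a program was given on the command line.
if [ "$#" -gt 0 ]; then
    # Assumption: Open MPI exports OMPI_COMM_WORLD_LOCAL_RANK; use 0 if unset.
    # Assumption: 2 NUMA nodes per host unless NUMA_NODES says otherwise.
    # The unquoted substitution is intentional: it must split into two options.
    exec numactl $(numa_bind_args "${OMPI_COMM_WORLD_LOCAL_RANK:-0}" "${NUMA_NODES:-2}") "$@"
fi
```

It would be launched as something like "mpirun -np 8 ./numa_wrap.sh ./xhpl", so that every rank passes through numactl before HPL starts.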

Believe it or not, I hit 94.5% HPL efficiency using this tactic on a 16 node cluster. Using processor affinity (various MPIs) my results were inconsistent and ranged between 88% and 93%.


If you're using multi-threaded HPL, that might be useful. But if you're not, I'd be surprised if you got any different results than Open MPI binding itself. If there really is a difference, we should figure out why. More specifically, calling numactl yourself should be pretty much exactly what we do in OMPI (via API, not via calling numactl).

--
Jeff Squyres
Cisco Systems
