On Dec 2, 2006, at 10:31 AM, Jeff Squyres wrote:
FWIW, especially on NUMA machines (like AMDs), physical access to
network resources (such as NICs / HCAs) can be much faster on
specific sockets.
For example, we recently ran some microbenchmarks with 2 MPI processes
across 2 NUMA machines (e.g., a simple ping-pong benchmark across 2
machines): if you pin the MPI process to socket 0/core 0, you get
noticeably better latency. If you don't, the MPI process may not be
consistently located physically close to the NIC/HCA, resulting in more
"jitter" in the delivered latency (or, even worse, consistently worse
latency).
I *believe* that this has to do with physical setup within the
machine (i.e., the NIC/HCA bus is physically "closer" to some
sockets), but I'm not much of a hardware guy to know that for sure.
Someone with more specific knowledge should chime in here...
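A rough sketch of that kind of ping-pong test in MPI C (the message
size, iteration count, and timing details here are illustrative, not
the exact benchmark described above):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int iters = 10000;
    char buf[8];                        /* tiny message: latency-bound */
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < iters; i++) {
        if (rank == 0) {
            MPI_Send(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("avg one-way latency: %.2f us\n",
               (t1 - t0) * 1e6 / (2.0 * iters));

    MPI_Finalize();
    return 0;
}

Run with one rank on each of the two nodes and compare a pinned run
against an unpinned one.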
This is true. It is because only a single CPU has a HyperTransport
connection to the chipset, which in turn connects to all other devices
(NIC, USB, hard disks). All the other CPUs have to send their data over
their link to the CPU that is connected to the chipset. I think, though
(not sure about dual core), that in systems up to 8-way every CPU has a
direct connection to every other CPU. So while a single CPU would have
lower latency, all the others should have roughly the same latency.
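One way to get a rough view of a box's topology from userspace is
libnuma's node-distance table. This reports memory-node distances
rather than the HT path to the chipset or HCA, so treat it only as a
hint; a minimal sketch (assumes libnuma is installed, link with -lnuma):

#include <numa.h>
#include <stdio.h>

int main(void)
{
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not available on this system\n");
        return 1;
    }
    int nodes = numa_max_node() + 1;
    for (int i = 0; i < nodes; i++) {
        for (int j = 0; j < nodes; j++)
            printf("%4d", numa_distance(i, j));   /* 10 == local node */
        printf("\n");
    }
    return 0;
}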
Personally I have not run this test, nor do I know how. Have you tried
it yourself? I would like to know this information for our own systems
(all AMDs).
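For what it's worth, on Linux one way to wire a process to a core
before rerunning such a test is to set the CPU affinity mask before
MPI_Init. A hypothetical sketch (core numbering vs. socket layout is
machine-specific, and this assumes one rank per node so core 0 is used
everywhere):

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(0, &set);            /* core 0; pick the core per rank in practice */
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        perror("sched_setaffinity");

    MPI_Init(&argc, &argv);
    /* ... run the ping-pong loop from the earlier sketch ... */
    MPI_Finalize();
    return 0;
}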
Brock
On Dec 1, 2006, at 2:13 PM, Greg Lindahl wrote:
On Fri, Dec 01, 2006 at 11:51:24AM +0100, Peter Kjellstrom wrote:
This might be a bit naive, but if you spawn two procs on a dual-core,
dual-socket system then the Linux kernel should automagically schedule
them this way.
No, we checked this for OpenMP and MPI, and in both cases wiring the
processes to cores was significantly better. The Linux scheduler
(still) tends to migrate processes to the wrong core when OS threads
and processes wake up and go back to sleep.
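If you want to watch that happen, a small sketch like the following
prints which core the process is on each second; without pinning the
number tends to move around as the process sleeps and wakes.
(sched_getcpu() is a glibc/Linux extension, so its availability is an
assumption here.)

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    for (int i = 0; i < 20; i++) {
        printf("iteration %2d: running on core %d\n", i, sched_getcpu());
        sleep(1);     /* sleeping and waking is when migration tends to happen */
    }
    return 0;
}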
Just like the Open MPI guys, we don't have a clever solution for the
"what if the user wants to have 2 OpenMP or MPI jobs share the same
node?" problem. Well, I have a plan, but it's annoying to implement.
-- greg
--
Jeff Squyres
Server Virtualization Business Unit
Cisco Systems