Hi Jeff,

Jeff Squyres <jsquy...@cisco.com> wrote:
I *believe* that this has to do with physical setup within the machine (i.e., the NIC/HCA bus is physically "closer" to some sockets), but I'm not much of a hardware guy to know that for sure. Someone with more specific knowledge should chime in here...

On NUMA architectures, the most common being Opteron, the South Bridge is connected through an HT link to one CPU on one socket. Which socket depends on the motherboard, but it should be described in the motherboard documentation (it's not always socket 0). If a process on the other socket needs to write something to a NIC on a PCIe bus behind the South Bridge, it first needs to hop through the first socket. This hop usually costs something like 100 ns, i.e. 0.1 us. If the socket is further away, as in a 4- or 8-socket configuration, there could be more hops.
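
If the motherboard documentation is not at hand, a Linux kernel that exposes NUMA information through sysfs can sometimes tell you which node a NIC sits behind. The following is only a rough sketch under that assumption: the helper name nic_numa_node and the interface name "eth0" are made up for illustration, and a value of -1 means the kernel reports no particular node for the device.

```python
import os

def nic_numa_node(iface, sysfs_root="/sys/class/net"):
    """Return the NUMA node the kernel reports for a network interface.

    Returns None if the interface or its numa_node file does not exist,
    and -1 if the kernel does not associate the device with any node.
    """
    path = os.path.join(sysfs_root, iface, "device", "numa_node")
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # No such interface, no sysfs entry, or unexpected file contents.
        return None

if __name__ == "__main__":
    # "eth0" is a placeholder; substitute the interface you care about.
    print("NUMA node for eth0:", nic_numa_node("eth0"))
```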

However, having processes bumped from one socket to another is more expensive in terms of cache locality (with all of the cache coherency overhead that comes with the lack of it) than in terms of HT routing.

Non-NUMA architectures like Intel Woodcrest have a flat access time to the South Bridge, but cache locality is still important, so setting CPU affinity is always a good idea.
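
To make the affinity point concrete, here is a minimal sketch of pinning a process to one CPU so the scheduler cannot move it to another socket. It assumes Linux, where Python exposes os.sched_getaffinity and os.sched_setaffinity; the helper name pin_to_cpu is made up for illustration, and which CPU is actually closest to the NIC still has to come from the motherboard layout. An MPI library or job launcher may offer its own binding options, which would normally be preferable to doing this by hand.

```python
import os

def pin_to_cpu(cpu=None):
    """Pin the calling process to a single CPU.

    If cpu is None, the lowest-numbered CPU the process is currently
    allowed to run on is used. Returns the resulting affinity set, or
    None on platforms without the sched_*affinity calls.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None  # not available here (e.g. non-Linux platforms)
    allowed = sorted(os.sched_getaffinity(0))  # 0 means "this process"
    if cpu is None:
        cpu = allowed[0]
    if cpu not in allowed:
        raise ValueError("CPU %d is not in the allowed set %r" % (cpu, allowed))
    os.sched_setaffinity(0, {cpu})
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    print("now restricted to CPUs:", pin_to_cpu())
```

Once pinned, the process keeps its working set in the caches of that one core and socket instead of rebuilding it after every migration.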

Patrick
--
Patrick Geoffray
Myricom, Inc.
http://www.myri.com
