On 28/09/2015 21:44, Dave Goodell (dgoodell) wrote:
> It may have to do with NUMA effects and the way you're allocating/touching
> your shared memory vs. your private (malloced) memory. If you have a
> multi-NUMA-domain system (i.e., any 2+ socket server, and even some
> single-socket servers) then you are likely to run into this sort of issue.
> The PCI bus on which your IB HCA communicates is almost certainly closer to
> one NUMA domain than the others, and performance will usually be worse if you
> are sending/receiving from/to a "remote" NUMA domain.
>
> "lstopo" and other tools can sometimes help you get a handle on the
> situation, though I don't know if it knows how to show memory affinity.
So, you'd like "lstopo --ps" or "hwloc-ps" to display memory binding and/or
memory location instead of CPU binding? Shouldn't be too hard.

Brice

> I think you can find memory affinity for a process via
> "/proc/<pid>/numa_maps". There's lots of info about NUMA affinity here:
> https://queue.acm.org/detail.cfm?id=2513149
>
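For reference, here's a minimal sketch (assuming the hwloc 2.x API; the buffer
size and file name are just placeholders) of how a program can query both
pieces of information itself: hwloc_get_area_membind reports the memory
binding (policy + nodeset), while hwloc_get_area_memlocation reports where the
touched pages actually ended up, which is roughly what /proc/<pid>/numa_maps
exposes per mapping.

/* sketch.c - build with: cc sketch.c $(pkg-config --cflags --libs hwloc) */
#include <stdio.h>
#include <stdlib.h>
#include <hwloc.h>

int main(void)
{
    hwloc_topology_t topology;
    hwloc_bitmap_t nodeset = hwloc_bitmap_alloc();
    hwloc_membind_policy_t policy;
    size_t i, len = 64 * 4096;  /* arbitrary buffer size for the example */
    char *buf = malloc(len), *str;

    hwloc_topology_init(&topology);
    hwloc_topology_load(topology);

    /* Touch the pages so they actually get allocated somewhere. */
    for (i = 0; i < len; i += 4096)
        buf[i] = 0;

    /* Memory *binding*: the policy/nodeset this range is bound to
     * (often just the default policy if nothing was bound explicitly). */
    if (!hwloc_get_area_membind(topology, buf, len, nodeset, &policy,
                                HWLOC_MEMBIND_BYNODESET)) {
        hwloc_bitmap_asprintf(&str, nodeset);
        printf("binding:  nodeset %s, policy %d\n", str, policy);
        free(str);
    }

    /* Memory *location*: the NUMA nodes where the touched pages
     * physically ended up. */
    if (!hwloc_get_area_memlocation(topology, buf, len, nodeset,
                                    HWLOC_MEMBIND_BYNODESET)) {
        hwloc_bitmap_asprintf(&str, nodeset);
        printf("location: nodeset %s\n", str);
        free(str);
    }

    free(buf);
    hwloc_bitmap_free(nodeset);
    hwloc_topology_destroy(topology);
    return 0;
}

For another process (what hwloc-ps would do), the same kind of information
would have to come from hwloc_get_proc_membind or from parsing that process's
/proc/<pid>/numa_maps.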