Thanks, Dave.
I have verified both the memory locality and the IB card locality; both look fine.
Quite by accident I found that there is a huge penalty if I mmap the shm
segment with PROT_READ only. Using PROT_READ | PROT_WRITE yields good
results, although I need to look into this further. I'll report back once I am
certain, in case somebody finds this useful.
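For the record, the difference boils down to something like the following
(a simplified sketch; the segment name, size handling and error checks are
placeholders, not my actual code):

    /* Sketch of the two mappings I compared. */
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    void *map_shm(const char *name, size_t len, int writable)
    {
        int fd = shm_open(name, O_RDWR, 0);   /* existing /dev/shm segment */
        if (fd < 0)
            return MAP_FAILED;

        int prot = writable ? (PROT_READ | PROT_WRITE)  /* fast case */
                            : PROT_READ;                /* slow case I observed */
        void *p = mmap(NULL, len, prot, MAP_SHARED, fd, 0);
        close(fd);
        return p;
    }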
Is this an OS feature, or does Open MPI somehow behave differently here? I
don't suppose you guys write to the send buffer, right? Even if you did,
that would segfault on a read-only mapping. So I am guessing it is the OS
preventing any writes through that mapping that introduces the overhead?
Marcin
On 09/28/2015 09:44 PM, Dave Goodell (dgoodell) wrote:
On Sep 27, 2015, at 1:38 PM, marcin.krotkiewski <marcin.krotkiew...@gmail.com>
wrote:
Hello, everyone
I am struggling a bit with IB performance when sending data from a POSIX shared
memory region (/dev/shm). The memory is shared among many MPI processes within
the same compute node. Essentially, the performance is somewhat erratic, but
my code seems to be roughly twice as slow as when using an ordinary, malloced
send buffer.
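For reference, the pattern is roughly the following (a simplified sketch with
a placeholder segment name and size, not the actual code):

    #include <mpi.h>
    #include <fcntl.h>
    #include <stdlib.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        size_t len = 1 << 20;   /* placeholder size */

        /* Each rank maps a segment in /dev/shm; ranks on the same node share it. */
        int fd = shm_open("/example_segment", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, (off_t)len);
        char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);

        /* The mapped region is passed directly to MPI_Send. */
        if (rank == 0) {
            MPI_Send(buf, (int)len, MPI_BYTE, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {
            char *tmp = malloc(len);
            MPI_Recv(tmp, (int)len, MPI_BYTE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            free(tmp);
        }

        munmap(buf, len);
        MPI_Finalize();
        return 0;
    }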
It may have to do with NUMA effects and the way you're allocating/touching your shared
memory vs. your private (malloced) memory. If you have a multi-NUMA-domain system (i.e.,
any 2+ socket server, and even some single-socket servers) then you are likely to run
into this sort of issue. The PCI bus on which your IB HCA communicates is almost
certainly closer to one NUMA domain than the others, and performance will usually be
worse if you are sending/receiving from/to a "remote" NUMA domain.
"lstopo" and other tools can sometimes help you get a handle on the situation, though I don't
know if it knows how to show memory affinity. I think you can find memory affinity for a process via
"/proc/<pid>/numa_maps". There's lots of info about NUMA affinity here:
https://queue.acm.org/detail.cfm?id=2513149
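If it helps, a rough (untested) sketch along these lines, using libnuma's
move_pages() and built with -lnuma, reports which NUMA node each page of a
buffer currently resides on -- essentially the per-page view behind
"/proc/<pid>/numa_maps":

    #include <numaif.h>
    #include <stdio.h>
    #include <unistd.h>

    void print_page_nodes(void *buf, size_t len)
    {
        long pagesz = sysconf(_SC_PAGESIZE);
        unsigned long npages = (len + pagesz - 1) / pagesz;
        void *pages[npages];
        int status[npages];

        for (unsigned long i = 0; i < npages; i++)
            pages[i] = (char *)buf + i * pagesz;

        /* nodes == NULL means "query only": status[i] receives the node id,
         * or a negative errno if the page is not resident. */
        if (move_pages(0 /* self */, npages, pages, NULL, status, 0) == 0)
            for (unsigned long i = 0; i < npages; i++)
                printf("page %lu -> node %d\n", i, status[i]);
    }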
-Dave