Re: [OMPI users] Asymmetric performance with nonblocking, multithreaded communications

Yiannis Papadopoulos Fri, 9 Dec 2011 10:21:12 -0500

Patrik Jonsson wrote:

Hi all,


I'm seeing performance issues I don't understand in my multithreaded
MPI code, and I was hoping someone could shed some light on this.

The code structure is as follows: A computational domain is decomposed
into MPI tasks. Each MPI task has a "master thread" that receives
messages from the other tasks and puts those into a local, concurrent
queue. The tasks then have a few "worker threads" that process the
incoming messages and, when necessary, send them to other tasks. So for
each task, there is one thread doing receives and N (typically number
of cores-1) threads doing sends. All messages are nonblocking, so the
workers just post the sends and continue with computation, and the
master repeatedly does a number of test calls to check for incoming
messages (there are different flavors of these messages so it does
several tests).

When do you call MPI_Test on the Isends? I have had performance issues on a number of systems when I used a single queue of MPI_Requests that held Isends to different ranks and tested them one by one. It appears that some messages are sent out more efficiently if you test them.

I found that either using MPI_Testsome or keeping a map (key = rank, value = queue of MPI_Requests) and testing the first MPI_Request for each key resolved this issue.
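To make the map idea concrete, here is a minimal sketch of the bookkeeping only, not the code I actually ran. PerRankSendQueues, add, poll and pending are hypothetical names, Request stands in for MPI_Request, and the completion check is passed in as a callable so the sketch compiles without an MPI installation. In a real program that callable would presumably wrap MPI_Test(&req, &flag, MPI_STATUS_IGNORE) and return flag != 0.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <map>
#include <utility>

// Hypothetical helper, not part of MPI or Open MPI.
// Request stands in for MPI_Request; is_done stands in for a wrapper
// around MPI_Test that returns true once the request has completed.
template <typename Request>
class PerRankSendQueues {
public:
    using TestFn = std::function<bool(Request&)>;

    explicit PerRankSendQueues(TestFn is_done) : is_done_(std::move(is_done)) {}

    // Record the request returned by a nonblocking send to dest_rank.
    void add(int dest_rank, Request req) {
        assert(dest_rank >= 0);
        queues_[dest_rank].push_back(std::move(req));
    }

    // Test only the oldest pending request for each destination rank.
    // Returns how many requests completed and were removed in this pass.
    std::size_t poll() {
        std::size_t completed = 0;
        for (auto it = queues_.begin(); it != queues_.end();) {
            std::deque<Request>& q = it->second;
            if (!q.empty() && is_done_(q.front())) {
                q.pop_front();
                ++completed;
            }
            // Drop ranks with nothing left so later passes skip them.
            if (q.empty()) {
                it = queues_.erase(it);
            } else {
                ++it;
            }
        }
        return completed;
    }

    // Total number of requests still waiting, over all ranks.
    std::size_t pending() const {
        std::size_t n = 0;
        for (const auto& kv : queues_) {
            n += kv.second.size();
        }
        return n;
    }

private:
    TestFn is_done_;
    std::map<int, std::deque<Request>> queues_;
};
```

Each call to poll() touches every destination rank once, so a slow send to one rank does not delay testing the sends to the others. The sketch is not thread-safe; if several worker threads post sends, it would need a lock or one instance per thread.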
