"The only way to get any benefit from the MPI_Bsend is to have a progress thread which take care of the pending communications in the background. Such thread is not enabled by default in Open MPI."
I understand this won't be portable, but how do you enable a sender-side progress thread in Open MPI so that MPI_Bsend communications can progress in the background?