On Mon, Jun 8, 2009 at 11:07 PM, Lars Andersson<lars...@gmail.com> wrote:
> I'd say that your own workaround here is to intersperse MPI_TESTs
> periodically. This will trigger OMPI's pipelined protocol for large
> messages, and should allow partial bursts of progress while you're
> assumedly off doing useful work. If this is difficult because the
> work is being done in library code that you can't change, then perhaps
> a pre-spawned "work" thread could be used to call MPI_TEST
> periodically. That way, it won't steal huge amounts of CPU cycles
> (like MPI_WAIT would). You still might get some cache thrashing,
> context switching, etc. -- YMMV.
Thanks Jeff, it's good to hear that this is a valid workaround. I've done a few small experiments, and by calling MPI_Test in a while loop with a usleep(1000) I'm able to get almost full bandwidth for large messages with less than 5% CPU utilization.

/Lars
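
For reference, a minimal sketch of the polling pattern discussed above. This is an illustration of the technique, not Lars's actual test code: the 64 MB message size, the two-rank send/receive pairing, and the message tag are assumptions made for the example; only the MPI_Test-plus-usleep(1000) loop comes from the thread.

#include <mpi.h>
#include <stdlib.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* 64 MB message: large enough to go through OMPI's pipelined
       large-message protocol (size is an illustrative assumption). */
    int count = 64 * 1024 * 1024;
    char *buf = malloc(count);

    /* Start a nonblocking transfer between ranks 0 and 1. */
    MPI_Request req;
    if (rank == 0)
        MPI_Isend(buf, count, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &req);
    else
        MPI_Irecv(buf, count, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &req);

    /* Instead of MPI_Wait (which busy-polls and pins a core), test and
       sleep: each MPI_Test call kicks the progress engine so the
       pipelined transfer keeps moving, while the usleep keeps CPU
       utilization low. Real application work could replace the sleep. */
    int done = 0;
    while (!done) {
        MPI_Test(&req, &done, MPI_STATUS_IGNORE);
        if (!done)
            usleep(1000);   /* ~1 ms, as in Lars's experiment */
    }

    free(buf);
    MPI_Finalize();
    return 0;
}

Run with exactly two ranks (e.g. mpirun -np 2 ./a.out). The 1 ms sleep is what keeps CPU utilization down to a few percent, at the cost of up to ~1 ms of extra latency in noticing completion.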