>> Are you overwhelming the receiver with short, unexpected messages such
>> that MPI keeps mallocing and mallocing and mallocing in an attempt to
>> eagerly receive all the messages?  I ask because Open MPI only eagerly
>> sends short messages -- long messages are queued up at the sender and
>> not actually transferred until the receiver starts to receive (aka a
>> "rendezvous protocol").

This is probably what is happening. In general, my processes send a massive
number of short messages, which is overwhelming the receivers. Since some
stages of the computation (processes) are much slower than others, the
slower ones cannot handle incoming messages at the same rate they are
delivered.
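For the archives, here is a minimal sketch of the pattern that triggered the
problem (the message count, size, and delay are made-up values, not the ones
from my application): rank 0 issues many short MPI_Send calls, each below
the eager limit so they return immediately, while rank 1 drains them more
slowly.

/* Sketch of the pathological pattern: a fast sender flooding a slow
 * receiver with short, eagerly-sent messages (hypothetical sizes). */
#include <mpi.h>
#include <unistd.h>

#define NUM_MSGS 1000000   /* hypothetical: a massive number of messages */
#define MSG_LEN  64        /* short enough to be sent eagerly */

int main(int argc, char **argv)
{
    int rank;
    char buf[MSG_LEN] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Each short MPI_Send completes as soon as the data is buffered
         * (eager protocol), so this loop runs far ahead of the receiver. */
        for (int i = 0; i < NUM_MSGS; i++)
            MPI_Send(buf, MSG_LEN, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        for (int i = 0; i < NUM_MSGS; i++) {
            usleep(100);   /* stands in for the slow computation stage */
            MPI_Recv(buf, MSG_LEN, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }
    }

    MPI_Finalize();
    return 0;
}

Every message that arrives before its MPI_Recv is posted gets buffered as
"unexpected" on the receive side, which is where the memory goes.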

>> Are you sure that you don't have some other kind of memory error in
>> your application?

I have checked, and there are no memory problems within the application.

>> FWIW, you can use MPI_SSEND to do a "synchronous" send, which means
>> that it won't complete until the receiver has started to receive the
>> message.  This may slow your sender down dramatically, however.  If it
>> slows down your sender too much, you may have to implement your own
>> flow control.

MPI_SSEND worked for my application and the problem went away, but as you
said, it slows down the senders. A better solution was to implement my own
flow control, as suggested: I implemented a simple credit-based flow control
scheme, and it solved the problem. A sketch of the idea is below.
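
For anyone who finds this thread later, this is roughly the shape of the
scheme (a simplified sketch, not my production code; the window size, batch
size, and tags are arbitrary): the sender starts with a fixed number of
credits, spends one per message, and blocks when it runs out; the receiver
returns credits in batches as it drains messages.

/* Sketch of a simple credit-based flow control scheme (hypothetical
 * constants and tags; simplified from what my application does). */
#include <mpi.h>

#define NUM_MSGS   100000
#define MSG_LEN    64
#define CREDITS    64            /* max in-flight (unreceived) messages */
#define BATCH      (CREDITS / 2) /* credits returned per acknowledgment */
#define TAG_DATA   1
#define TAG_CREDIT 2

/* Sender: spend one credit per message; when they run out, block until
 * the receiver grants a new batch. */
static void sender(int peer)
{
    char buf[MSG_LEN] = {0};
    int credits = CREDITS, granted;

    for (int i = 0; i < NUM_MSGS; i++) {
        if (credits == 0) {
            MPI_Recv(&granted, 1, MPI_INT, peer, TAG_CREDIT,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            credits += granted;
        }
        MPI_Send(buf, MSG_LEN, MPI_CHAR, peer, TAG_DATA, MPI_COMM_WORLD);
        credits--;
    }
}

/* Receiver: after draining a batch of messages, grant the sender that
 * many credits back -- but only while it still needs them, so no credit
 * message is left unmatched at MPI_Finalize. */
static void receiver(int peer)
{
    char buf[MSG_LEN];
    int consumed = 0, grant = BATCH;

    for (int i = 0; i < NUM_MSGS; i++) {
        MPI_Recv(buf, MSG_LEN, MPI_CHAR, peer, TAG_DATA,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        /* ... process the message (the slow stage) ... */
        if (++consumed == BATCH) {
            consumed = 0;
            /* Permits granted so far equal CREDITS + (i + 1) - BATCH. */
            if (CREDITS + (i + 1) - BATCH < NUM_MSGS)
                MPI_Send(&grant, 1, MPI_INT, peer, TAG_CREDIT,
                         MPI_COMM_WORLD);
        }
    }
}

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)      sender(1);
    else if (rank == 1) receiver(0);
    MPI_Finalize();
    return 0;
}

The key property is that the receiver's unexpected-message queue can never
hold more than CREDITS messages, so memory use stays bounded no matter how
much faster the sender is.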

Thanks a lot for the explanation and suggestions.

On Tue, Sep 6, 2011 at 9:43 AM, Jeff Squyres <jsquy...@cisco.com> wrote:

> Are you overwhelming the receiver with short, unexpected messages such that
> MPI keeps mallocing and mallocing and mallocing in an attempt to eagerly
> receive all the messages?  I ask because Open MPI only eagerly sends short
> messages -- long messages are queued up at the sender and not actually
> transferred until the receiver starts to receive (aka a "rendezvous
> protocol").
>
> While that *can* happen, I'd be a little surprised if it did.  Indeed, it
> would probably take a little while for that to happen (i.e., the time
> necessary for the receiver to malloc a small amount N times, where N is
> large enough to exhaust the virtual memory on your machine, coupled with all
> the time delay to page out all the old memory and page in on-demand as Open
> MPI scans for new incoming matches... this could be pretty darn slow).  Is
> that what is happening?
>
> Are you sure that you don't have some other kind of memory error in your
> application?
>
> FWIW, you can use MPI_SSEND to do a "synchronous" send, which means that it
> won't complete until the receiver has started to receive the message.  This
> may slow your sender down dramatically, however.  If it slows down your
> sender too much, you may have to implement your own flow control.
>
>
> On Aug 25, 2011, at 10:58 PM, Rodrigo Oliveira wrote:
>
> > Hi there,
> >
> > I am facing some problems in an Open MPI application. Part of the
> > application is composed of a sender and a receiver. The problem is that
> > the sender is much faster than the receiver, which causes the receiver's
> > memory to be completely exhausted, aborting the application.
> >
> > I would like to know if there is a flow control scheme implemented in
> > Open MPI, or if this issue has to be treated at the user application's
> > layer. If one exists, how does it work and how can I use it in my
> > application?
> >
> > I did some research on this subject, but I did not find a conclusive
> > explanation.
> >
> > Thanks a lot.
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
>
