>> Are you overwhelming the receiver with short, unexpected messages such that MPI keeps mallocing
>> and mallocing and mallocing in an attempt to eagerly receive all the messages? I ask because Open
>> MPI only eagerly sends short messages -- long messages are queued up at the sender and not
>> actually transferred until the receiver starts to receive (aka a "rendezvous protocol").
This is probably what is happening. In general, my processes send a massive
number of short messages, and that is overwhelming the receivers. Because some
stages of the computation (processes) are much slower than others, the slower
ones cannot handle the incoming messages at the rate they are delivered.

>> Are you sure that you don't have some other kind of memory error in your
>> application?

I have checked, and there are no memory problems within the application.

>> FWIW, you can use MPI_SSEND to do a "synchronous" send, which means that it
>> won't complete until the receiver has started to receive the message. This
>> may slow your sender down dramatically, however. If it slows down your
>> sender too much, you may have to implement your own flow control.

MPI_SSEND worked for my application and the problem went away, but, as you
said, it slows the senders down. A better solution was to implement my own
flow control, as suggested: I implemented a simple credit-based scheme, and it
solved my problem (a minimal sketch is included at the end of this message,
after the quoted thread).

Thanks a lot for the explanation and suggestions.

On Tue, Sep 6, 2011 at 9:43 AM, Jeff Squyres <jsquy...@cisco.com> wrote:

> Are you overwhelming the receiver with short, unexpected messages such that
> MPI keeps mallocing and mallocing and mallocing in an attempt to eagerly
> receive all the messages? I ask because Open MPI only eagerly sends short
> messages -- long messages are queued up at the sender and not actually
> transferred until the receiver starts to receive (aka a "rendezvous
> protocol").
>
> While that *can* happen, I'd be a little surprised if it did. Indeed, it
> would probably take a little while for that to happen (i.e., the time
> necessary for the receiver to malloc a small amount N times, where N is
> large enough to exhaust the virtual memory on your machine, coupled with
> all the time delay to page out all the old memory and page in on-demand as
> Open MPI scans for new incoming matches... this could be pretty darn slow).
> Is that what is happening?
>
> Are you sure that you don't have some other kind of memory error in your
> application?
>
> FWIW, you can use MPI_SSEND to do a "synchronous" send, which means that it
> won't complete until the receiver has started to receive the message. This
> may slow your sender down dramatically, however. If it slows down your
> sender too much, you may have to implement your own flow control.
>
> On Aug 25, 2011, at 10:58 PM, Rodrigo Oliveira wrote:
>
> > Hi there,
> >
> > I am facing some problems in an Open MPI application. Part of the
> > application is composed of a sender and a receiver. The problem is that
> > the sender is much faster than the receiver, which causes the receiver's
> > memory to be completely consumed, aborting the application.
> >
> > I would like to know whether there is a flow control scheme implemented
> > in Open MPI, or whether this issue has to be handled at the user
> > application layer. If one exists, how does it work and how can I use it
> > in my application?
> >
> > I did some research on this subject, but I did not find a conclusive
> > explanation.
> >
> > Thanks a lot.
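P.S. In case it is useful for the archives, below is a minimal sketch of the
kind of credit-based scheme I mean. Everything in it (the tags, the window
size NUM_CREDITS, the fixed message size, and the two-rank layout) is
simplified for illustration; my real code keeps one credit window per sender.

/* Minimal sketch of credit-based flow control between one sender
 * (rank 0) and one receiver (rank 1).  The sender may have at most
 * NUM_CREDITS messages outstanding, so the receiver's queue of
 * unexpected (eagerly sent) messages stays bounded. */
#include <mpi.h>

#define DATA_TAG    1
#define CREDIT_TAG  2
#define NUM_CREDITS 64     /* flow-control window: max messages in flight */
#define MSG_LEN     128    /* short messages, below the eager limit */
#define NUM_MSGS    100000

static void sender(void)
{
    char buf[MSG_LEN] = "payload";
    int credits = NUM_CREDITS, grant;
    MPI_Request req;

    /* Pre-post the receive for the next credit grant so the receiver's
     * credit message always finds a matching receive. */
    MPI_Irecv(&grant, 1, MPI_INT, 1, CREDIT_TAG, MPI_COMM_WORLD, &req);

    for (int i = 0; i < NUM_MSGS; ++i) {
        if (credits == 0) {
            /* Out of credits: block until the receiver grants more.
             * This is what throttles a fast sender. */
            MPI_Wait(&req, MPI_STATUS_IGNORE);
            credits += grant;
            MPI_Irecv(&grant, 1, MPI_INT, 1, CREDIT_TAG,
                      MPI_COMM_WORLD, &req);
        }
        MPI_Send(buf, MSG_LEN, MPI_CHAR, 1, DATA_TAG, MPI_COMM_WORLD);
        --credits;
    }

    /* The last pre-posted credit receive is never matched; cancel it. */
    MPI_Cancel(&req);
    MPI_Wait(&req, MPI_STATUS_IGNORE);
}

static void receiver(void)
{
    char buf[MSG_LEN];
    int consumed = 0, granted = 0;

    for (int i = 0; i < NUM_MSGS; ++i) {
        MPI_Recv(buf, MSG_LEN, MPI_CHAR, 0, DATA_TAG,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        /* ... process the message at whatever rate this stage manages ... */

        /* Return credits in batches of half the window so the sender
         * rarely stalls, and stop granting once the sender has been
         * given enough credits to send everything. */
        if (++consumed == NUM_CREDITS / 2 &&
            NUM_CREDITS + granted < NUM_MSGS) {
            MPI_Send(&consumed, 1, MPI_INT, 0, CREDIT_TAG, MPI_COMM_WORLD);
            granted += consumed;
            consumed = 0;
        }
    }
}

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)      sender();
    else if (rank == 1) receiver();
    MPI_Finalize();
    return 0;
}

With this in place the plain MPI_Send stays fast (short messages still go
eagerly), but the receiver can never fall more than NUM_CREDITS messages
behind, so its memory use is bounded without paying the full round-trip cost
of MPI_SSEND on every message.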