Hi Jeff, thanks for the explanation. Yes, some of the earlier discussions were in fact very useful. In general I have found this list to be very helpful; my thanks to everyone here who helps out people like me.
The suggestion to use messages and non-blocking receives with MPI_Test() proved to be just what we needed for another portion of the code, the one where the rank 0 front-end uses MPI_Comm_spawn() to launch the other codes. Maybe we can cook up something similar for the rank > 0 front-ends; the same principle should work (a rough sketch of what we have in mind is appended below the quoted message). We just wanted to voice our experience with MPI_Barrier() on this list so that maybe in the future someone will work on implementing MPI_Barrier() differently.

nick

On Fri, Dec 4, 2009 at 17:37, Jeff Squyres <jsquy...@cisco.com> wrote:
> On Dec 4, 2009, at 6:54 PM, Nicolas Bock wrote:
>
> > in our code we use a very short front-end program to drive a larger set
> > of codes that do our calculations. Right at the beginning of the front-end,
> > we have an if() statement such that only the rank 0 front-end does
> > something, and the other ranks go right away to an MPI_Barrier() statement,
> > waiting for the rank 0 front-end to finish. The rank 0 front-end then goes
> > ahead and does its thing by calling the other codes with MPI_Comm_spawn().
> >
> > We noticed that the rank > 0 copies of the front-end consume a lot of CPU
> > while they are waiting at the MPI_Barrier(). This is obviously not what we
> > had intended. From previous discussions on this list I understand that the
> > CPU consumption stems from the aggressive polling frequency of the
> > MPI_Barrier() function. While I understand that there are a lot of
> > situations where a high polling frequency in MPI_Barrier() is useful, the
> > situation we are in is not one of them.
>
> Understood.
>
> > Is our master and slave programming model such an unusual way of using
> > MPI?
>
> Define "unusual". :-)
>
> MPI applications tend to be home-grown and very specific to the individual
> problems that they are trying to solve. *Most* MPI applications are trying
> to get the lowest latency (perhaps it's more accurate to say "the most
> discussed/cited MPI applications"). As such, we designed Open MPI to get
> that lowest latency -- and that's by polling. :-\ I can't speak for other
> MPI implementations, but I believe that most of them do similar things.
>
> Some users don't necessarily want rock-bottom latency; they want features
> like what you want (e.g., true blocking in MPI blocking functions). At the
> moment, Open MPI doesn't cater to those requirements. We've had a lot of
> discussion over the years about this specific feature, and we have some
> pretty good ideas of how to do it. But the priority has never really
> percolated up high enough for someone to start actually working on it. :-\
>
> So I wouldn't characterize your application as "unusual" -- *every* MPI app
> is unusual. I would say that your requirement of not consuming CPU while
> in a blocking MPI function is in the minority.
>
> There were a few suggestions posted earlier today on how to be creative and
> get around these implementation artifacts -- perhaps they might be useful to
> you...?
>
> --
> Jeff Squyres
> jsquy...@cisco.com
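P.S. For completeness, here is a minimal, untested sketch of the kind of replacement for MPI_Barrier() we have in mind for the rank > 0 front-ends; the message tag and the 100 ms sleep interval are just placeholders. Rank 0 sends a small "done" message to every other rank when it finishes, and the other ranks post an MPI_Irecv() and poll it with MPI_Test(), sleeping between tests so they stay off the CPU:

#include <mpi.h>
#include <unistd.h> /* usleep() */

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        int done = 1;
        int dest;

        /* ... the real work goes here, e.g. MPI_Comm_spawn() the other codes ... */

        /* Tell the waiting front-ends that we are finished. */
        for (dest = 1; dest < size; dest++) {
            MPI_Send(&done, 1, MPI_INT, dest, 0, MPI_COMM_WORLD);
        }
    } else {
        int done = 0;
        int flag = 0;
        MPI_Request request;

        /* Post a non-blocking receive and test it periodically instead of
         * spinning in MPI_Barrier(). The sleep interval is a trade-off
         * between wake-up latency and CPU usage. */
        MPI_Irecv(&done, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        while (!flag) {
            MPI_Test(&request, &flag, MPI_STATUS_IGNORE);
            if (!flag) {
                usleep(100000); /* 100 ms */
            }
        }
    }

    MPI_Finalize();
    return 0;
}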