Re: [OMPI users] CPU burning in Wait state

George Bosilca Wed, 3 Sep 2008 13:48:03 -0400

This program is 100% correct from the MPI perspective. However, in Open MPI (and, I think, in most other MPI implementations), a collective communication will consume most of the available resources, as all blocking functions do.

Now I will answer your original post. Using non-blocking communications in this particular case will give you a benefit, as the data involved in the communications is small enough to achieve a perfect overlap. If you try to do exactly the same with larger data, non-blocking communications will negatively impact performance, as MPI is not supposed to communicate when the user application is not inside an MPI call.


  george.

On Sep 3, 2008, at 6:32 PM, Vincent Rotival wrote:

OK, let's take this simple example. I might have used the wrong terms, and I apologize for that.
While the rank 0 process is sleeping, the other ones are in bcast, waiting for data:

program test
use mpi
implicit none

integer :: mpi_wsize, mpi_rank, mpi_err
integer :: data

call mpi_init(mpi_err)
call mpi_comm_size(MPI_COMM_WORLD, mpi_wsize, mpi_err)
call mpi_comm_rank(MPI_COMM_WORLD, mpi_rank, mpi_err)
if(mpi_rank.eq.0) then
   call sleep(100)
   data = 10
end if

call mpi_bcast(data, 1, MPI_INTEGER, 0, MPI_COMM_WORLD, mpi_err)

print *, "Done in #", mpi_rank, " => data=", data

call mpi_finalize(mpi_err)

end program test


George Bosilca wrote:
On Sep 3, 2008, at 6:11 PM, Vincent Rotival wrote:
Eugene,

No, what I'd like is that when doing something like

call mpi_bcast(data, 1, MPI_INTEGER, 0, .....)
the program continues only AFTER the Bcast has completed (so control is not returned to the user early), but while the threads with rank > 0 are waiting in Bcast, they do not consume CPU resources.

Threads with rank > 0? Now, this scares me! If all your threads enter the bcast, then I guess the application is not correct from the MPI standard's perspective (i.e., on each communicator there is only one collective at any moment). In MPI, each process (and not each thread) has a rank, and each process exists in each communicator only once. In other words, since each collective is bound to a specific communicator, only one thread on each of your processes should enter MPI_Bcast if you want only ONE collective.
 george.
I hope this is clearer; I apologize for not being clear in the first place.
Vincent



Eugene Loh wrote:
Vincent Rotival wrote:
The solution I settled on was for the main thread to Isend the data separately to each of the other threads, which use Irecv plus a loop on mpi_test to check whether the Irecv has finished. It might be dirty, but it works much better than using Bcast.
Thanks for the clarification.
But this strikes me more as a question about the MPI standard than about the Open MPI implementation. That is, what you really want is for the MPI API to support a non-blocking form of collectives. You want control to return to the user program before the barrier/bcast/etc. operation has completed. That's an API change.
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
