When you receive that callback, MPI has already been put into a quiescent state. As such, it does not allow MPI communication until the checkpoint has completely finished, so you cannot call a barrier in the checkpoint callback. Since Open MPI is doing a coordinated checkpoint, you can assume that all processes are calling the same callback at about the same time (the coordination algorithm synchronizes them for you).
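
To make that concrete, here is a rough sketch of a checkpoint callback that does only local work and makes no MPI calls. It is untested against a real Open MPI build. The state variables (app_rank, app_iteration), the ckpt_filename helper, and the file naming are hypothetical, and I am assuming that returning 0 signals success and that setting *restart_cmd to NULL is acceptable when you have no custom restart command.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical application state. The rank is cached from MPI_Comm_rank()
 * during normal execution, because MPI must not be called in the callback. */
static int  app_rank = 0;
static long app_iteration = 0;

/* Hypothetical helper: build a per-process checkpoint file name. */
static void ckpt_filename(char *buf, size_t len)
{
    snprintf(buf, len, "app_state.%d.ckpt", app_rank);
}

/* SELF CRS checkpoint callback. MPI is quiescent here, so this does local
 * file I/O only: no MPI_Barrier and no other MPI communication. */
int opal_crs_self_user_checkpoint(char **restart_cmd)
{
    char path[64];
    FILE *fp;

    assert(restart_cmd != NULL);

    ckpt_filename(path, sizeof(path));
    fp = fopen(path, "w");
    if (fp == NULL) {
        return -1;            /* assumption: non-zero reports failure */
    }
    fprintf(fp, "%ld\n", app_iteration);
    fclose(fp);

    *restart_cmd = NULL;      /* assumption: no custom restart command */
    return 0;                 /* assumption: 0 reports success */
}
```

The point of the design is that each process saves its own state independently, and any cross-process agreement is left to the coordination protocol that has already run by the time this callback fires.
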
If you would like a notification callback before the quiescence protocol, you might want to look at the INC callbacks:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#api-cr_inc_register_callback

They are available in the Open MPI trunk (v1.7). The OMPI_CR_INC_PRE_CRS_PRE_MPI callback will give you immediate notice, and you -should- be able to make MPI calls in that callback. I have not tried it, but conceptually it should work. If it does not, I can file a bug ticket and we can look into addressing it.

-- Josh

On Wed, Feb 15, 2012 at 4:23 AM, Faisal Shahzad <itsfa...@hotmail.com> wrote:

> Dear Group,
>
> I wanted to do a synchronization check with 'MPI_Barrier(MPI_COMM_WORLD)'
> in 'opal_crs_self_user_checkpoint(char **restart_cmd)' call. Although every
> process is present in this call, it fails to synchronize. Is there any
> reason why cant we use barrier?
> Thanks in advance.
>
> Kind regards,
> Faisal

--
Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey