I'm using Open MPI 2.1.0 for this, but I'm not sure whether this is an
Open MPI-specific implementation question or a question about what the MPI
standard guarantees.

I have an application which runs across multiple ranks, eventually reaching
an MPI_Gather() call.  Along the way, if one of the ranks encounters an
error, it will report the error to a log, call MPI_Finalize(), and exit
with a non-zero return code.  If this happens before the other ranks make
it to the gather, it seems like mpirun notices this and the processes end
on all ranks.  This is what I want to happen - it's a legitimate error, so
all processes should be freed up so the next job can run.  However, if the
other ranks make it into the MPI_Gather() before the one rank reports its
error, they seem to wait in the MPI_Gather() forever.

Is there something simple I can do to guarantee that if any process calls
MPI_Finalize(), all my ranks terminate?

Thanks.
-Adam