I'm using Open MPI 2.1.0 for this, but I'm not sure whether this is an Open MPI-specific implementation question or a question about what the MPI standard guarantees.
I have an application which runs across multiple ranks, eventually reaching an MPI_Gather() call. Along the way, if one of the ranks encounters an error, it will report the error to a log, call MPI_Finalize(), and exit with a non-zero return code.

If this happens before the other ranks make it to the gather, mpirun seems to notice and the process ends on all ranks. This is what I want to happen: it's a legitimate error, so all processes should be freed up so the next job can run. However, if the other ranks make it into the MPI_Gather() before the one rank reports an error, they seem to wait in the MPI_Gather() forever.

Is there something simple I can do to guarantee that if any process calls MPI_Finalize(), all my ranks terminate?

Thanks.
-Adam