On 13.09.2011 at 02:43, Ralph Castain wrote:

> We don't have anything similar in OMPI. There are fault tolerance modes, but
> not like the one you describe.

You can join the mpi3-ft list, which covers fault tolerance, at
http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
There is also an archive at http://lists.mpi-forum.org/mpi3-ft/

I was pointed to it here:
http://www.open-mpi.org/community/lists/users/2011/01/15440.php

-- Reuti

> On Sep 12, 2011, at 5:52 PM, Rob Stewart wrote:
>
>> Hi,
>>
>> I have implemented a simple fault tolerant ping pong C program with MPI,
>> here: http://pastebin.com/7mtmQH2q
>>
>> MPICH2 offers a parameter with mpiexec:
>> $ mpiexec -disable-auto-cleanup
>>
>> .. as described here: http://trac.mcs.anl.gov/projects/mpich2/ticket/1421
>>
>> It is fault tolerant in the respect that, when I ssh to one of the nodes in
>> the hosts file and kill the relevant process, the MPI job is not
>> terminated. The ping simply will not prompt a pong from the dead node, but
>> the ping-pong runs forever on the remaining live nodes.
>>
>> Is such a feature available for openMPI, either via mpiexec or some other
>> means?
>>
>>
>> --
>> Rob Stewart
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users