On 13.09.2011 at 02:43, Ralph Castain wrote:

> We don't have anything similar in OMPI. There are fault tolerance modes, but 
> not like the one you describe.

You can join the mpi3-ft list, which covers fault tolerance:
http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft
There is also an archive:
http://lists.mpi-forum.org/mpi3-ft/

I was pointed to it here 
http://www.open-mpi.org/community/lists/users/2011/01/15440.php

-- Reuti


> On Sep 12, 2011, at 5:52 PM, Rob Stewart wrote:
> 
>> Hi,
>> 
>> I have implemented a simple fault tolerant ping pong C program with MPI, 
>> here: http://pastebin.com/7mtmQH2q
>> 
>> MPICH2 offers a parameter with mpiexec:
>> $ mpiexec -disable-auto-cleanup
>> 
>> .. as described here: http://trac.mcs.anl.gov/projects/mpich2/ticket/1421
>> 
>> It is fault tolerant in the sense that, when I ssh to one of the nodes in 
>> the hosts file and kill the relevant process, the MPI job is not 
>> terminated. The ping simply gets no pong back from the dead node, and 
>> the ping-pong keeps running on the remaining live nodes.
>> 
>> Is such a feature available for Open MPI, either via mpiexec or some other 
>> means?
>> 
>> 
>> -- 
>> Rob Stewart
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 

