Rob,

The Open MPI community did consider such an option, but deemed it 
uninteresting. However, we (the UTK team) have a patched version supporting 
several fault-tolerant modes, including the one you described in your email. 
If you are interested, please contact me directly.

  Thanks,
    george.


On Sep 12, 2011, at 20:43 , Ralph Castain wrote:

> We don't have anything similar in OMPI. There are fault tolerance modes, but 
> not like the one you describe.
> 
> On Sep 12, 2011, at 5:52 PM, Rob Stewart wrote:
> 
>> Hi,
>> 
>> I have implemented a simple fault tolerant ping pong C program with MPI, 
>> here: http://pastebin.com/7mtmQH2q
>> 
>> MPICH2 offers a parameter with mpiexec:
>> $ mpiexec -disable-auto-cleanup
>> 
>> .. as described here: http://trac.mcs.anl.gov/projects/mpich2/ticket/1421
>> 
>> It is fault tolerant in the sense that, when I ssh to one of the nodes in 
>> the hosts file and kill the relevant process, the MPI job is not 
>> terminated. The ping simply gets no pong from the dead node, and the 
>> ping-pong continues indefinitely on the remaining live nodes.
>> 
>> Is such a feature available for Open MPI, either via mpiexec or some other 
>> means?
>> 
>> 
>> -- 
>> Rob Stewart
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 

