(sigh) Let me clarify this to resolve some off-list chatter.
It would be rather simple to implement an option that allowed an MPI job to
continue executing after the failure of one or more processes. The problem is
that OMPI's MPI layer does not yet know how to handle that situation. As Josh
ind
Actually, I honestly don't remember even having that discussion. In looking at
it, this would be relatively easy to implement if someone really wanted it.
Only issue: user would bear full responsibility for OMPI not cleaning up failed
jobs since we wouldn't terminate upon seeing a proc fail. Def
Though I do not share George's pessimism about acceptance by the Open
MPI community, it has been somewhat difficult to add such a
non-standard feature to the code base for various reasons.
At ORNL, I have been developing a prototype for the MPI Forum Fault
Tolerance Working Group [1] of the Run-Th
Rob,
The Open MPI community did consider such an option, but deemed it
uninteresting. However, we (the UTK team) have a patched version supporting several
fault-tolerant modes, including the one you described in your email. If you are
interested please contact me directly.
Thanks,
geor
On 13.09.2011 at 02:43, Ralph Castain wrote:
> We don't have anything similar in OMPI. There are fault tolerance modes, but
> not like the one you describe.
You can join mpi3-ft at
http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft, there is also an
archive http://lists.mpi-forum.org/mpi
We don't have anything similar in OMPI. There are fault tolerance modes, but
not like the one you describe.
On Sep 12, 2011, at 5:52 PM, Rob Stewart wrote:
> Hi,
>
> I have implemented a simple fault tolerant ping pong C program with MPI,
> here: http://pastebin.com/7mtmQH2q
>
> MPICH2 offers
Hi,
I have implemented a simple fault tolerant ping pong C program with MPI,
here: http://pastebin.com/7mtmQH2q
MPICH2 offers a parameter with mpiexec:
$ mpiexec -disable-auto-cleanup
.. as described here: http://trac.mcs.anl.gov/projects/mpich2/ticket/1421
It is fault tolerant in the respec