On 27.01.2011 at 16:10, Joshua Hursey wrote:
>
> On Jan 27, 2011, at 9:47 AM, Reuti wrote:
>
>> On 27.01.2011 at 15:23, Joshua Hursey wrote:
>>
>>> The current version of Open MPI does not support continued operation of an
>>> MPI application after process failure within a job. If a process
On Jan 27, 2011, at 9:47 AM, Reuti wrote:
> On 27.01.2011 at 15:23, Joshua Hursey wrote:
>
>> The current version of Open MPI does not support continued operation of an
>> MPI application after process failure within a job. If a process dies, so
>> will the MPI job. Note that this is true of
On Jan 27, 2011, at 7:47 AM, Reuti wrote:
> On 27.01.2011 at 15:23, Joshua Hursey wrote:
>
>> The current version of Open MPI does not support continued operation of an
>> MPI application after process failure within a job. If a process dies, so
>> will the MPI job. Note that this is true of
On 27.01.2011 at 15:23, Joshua Hursey wrote:
> The current version of Open MPI does not support continued operation of an
> MPI application after process failure within a job. If a process dies, so
> will the MPI job. Note that this is true of many MPI implementations out
> there at the moment
The current version of Open MPI does not support continued operation of an MPI
application after process failure within a job. If a process dies, so will the
MPI job. Note that this is true of many MPI implementations out there at the
moment.
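
To make the "if a process dies, so will the MPI job" behavior concrete, here is a toy model of it in plain Python. This is only a sketch under stated assumptions: it does not use MPI at all and is not how Open MPI's launcher is implemented. The name run_job and the worker scripts are hypothetical, chosen for illustration. A launcher starts several child processes, polls them, and kills the remaining ones as soon as any child exits with a non-zero status.

```python
import subprocess
import sys
import time


def run_job(worker_scripts, poll_interval=0.05):
    """Toy launcher: start one child per script, abort all on first failure.

    worker_scripts is a list of Python source strings (hypothetical
    stand-ins for the ranks of a parallel job).
    Returns (job_ok, return_codes).
    """
    procs = [
        subprocess.Popen([sys.executable, "-c", src]) for src in worker_scripts
    ]
    job_ok = True
    while True:
        codes = [p.poll() for p in procs]  # None means "still running"
        if any(c is not None and c != 0 for c in codes):
            # One child failed: take the whole job down with it.
            job_ok = False
            for p in procs:
                if p.poll() is None:
                    p.kill()
            break
        if all(c == 0 for c in codes):
            # Every child finished cleanly.
            break
        time.sleep(poll_interval)
    # wait() reaps the children and yields their final exit statuses.
    return job_ok, [p.wait() for p in procs]
```

Calling run_job with one script that exits non-zero and two that sleep for a long time returns quickly with job_ok set to False, because the sleeping children are killed rather than left to finish. That is the all-or-nothing policy described above; letting the survivors continue would need support from the MPI library itself.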
At Oak Ridge National Laboratory, we are working on
Hi,
I was wondering what support Open MPI has for allowing a job to
continue running when one or more processes in the job die
unexpectedly? Is there a special mpirun flag for this? Any other ways?
It seems obvious that collectives will fail once a process dies, but
would it be possible to create