Re: [OMPI users] Running on crashing nodes

2010-09-27 Thread Randolph Pullen
program with a known short string, if the monitor does not see this string prefixed on a line, it can terminate MPI, check available nodes and recast the job accordingly. Hope this helps, Randolph --- On Fri, 24/9/10, Joshua Hursey wrote: From: Joshua Hursey Subject: Re: [OMPI users] Running on
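
The prefix check described in this snippet could look roughly like the sketch below. It covers only the detection step: the marker string `OK:`, the function name, and the choice to ignore blank lines are all assumptions made for illustration, not details from the message. Terminating MPI, checking available nodes and recasting the job are left to the caller.

```python
from typing import Iterable, Optional

# Hypothetical marker; the message only says "a known short string".
KNOWN_PREFIX = "OK:"


def first_unprefixed_line(lines: Iterable[str],
                          prefix: str = KNOWN_PREFIX) -> Optional[int]:
    """Return the 0-based index of the first non-empty line that does not
    start with the prefix, or None if every non-empty line carries it."""
    for index, line in enumerate(lines):
        text = line.rstrip("\r\n")
        # Assumption: blank lines are ignored rather than treated as failures.
        if text and not text.startswith(prefix):
            return index
    return None
```

A monitor would feed this the job's output as it arrives and, on a non-None result, stop the run and resubmit on whichever nodes still respond.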

Re: [OMPI users] Running on crashing nodes

2010-09-24 Thread Joshua Hursey
As one of the Open MPI developers actively working on the MPI layer stabilization/recovery feature set, I don't think we can give you a specific timeframe for availability, especially availability in a stable release. Once the initial functionality is finished, we will open it up for user testing

Re: [OMPI users] Running on crashing nodes

2010-09-24 Thread Andrei Fokau
Ralph, could you tell us when this functionality will be available in the stable version? A rough estimate will be fine. On Fri, Sep 24, 2010 at 01:24, Ralph Castain wrote: > In a word, no. If a node crashes, OMPI will abort the currently-running job > if it had processes on that node. There is

Re: [OMPI users] Running on crashing nodes

2010-09-23 Thread Ralph Castain
In a word, no. If a node crashes, OMPI will abort the currently-running job if it had processes on that node. There is no current ability to "ride-thru" such an event. That said, there is work being done to support "ride-thru". Most of that is in the current developer's code trunk, and more is com