> You seem like you are now sufficiently-equipped to add this doc. Any
> chance you're game to write the doc you wish you had read? :-)
Possibly. Time constraints aside, my concern is that the questions I've
asked (and the answers I was seeking) were based on the assumption that my
jobs all had u
>
> Might I suggest folding this information into the user guide?
You seem like you are now sufficiently-equipped to add this doc. Any
chance you're game to write the doc you wish you had read? :-)
Just to be absolutely clear on this: KILLING -> LOST will _never_ result in
> a reschedule? What
Also (sorry for repeated messages), what's the deal with KILLING ->
[FINISHED, FAILED]? User sends kill request but Mesos reports it's done
before it gets through so congratulations, you get to keep it?
Hussein Elgridly
Senior Software Engineer, DSDE
The Broad Institute of MIT and Harvard
On 20
>> 5. A job in the LOST state will always be rescheduled unless it went
>> through KILLING first. (What does this represent - killed by user and then
>> lost connectivity to the slave?)
>>
> True. That is one way it could happen, it could also happen if the
> scheduler times the task out while wa
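
For what it's worth, rule 5 as confirmed above can be written down as a small
check. This is only a sketch in Python, not Aurora code: the function name and
the idea of passing the task's state history as a list of state names are
assumptions made for illustration, and it encodes nothing beyond the rule as
stated in this thread.

```python
# Hypothetical helper, not part of Aurora: illustrates rule 5 from this thread.
def will_reschedule_after_lost(history):
    """Given a task's state history (oldest first) that ends in LOST,
    return True if the task would be rescheduled under rule 5:
    LOST is rescheduled unless the task passed through KILLING first."""
    if not history or history[-1] != "LOST":
        raise ValueError("history must end in LOST")
    # Any KILLING before the final LOST means the kill was requested,
    # so the scheduler should not bring the task back.
    return "KILLING" not in history[:-1]


if __name__ == "__main__":
    print(will_reschedule_after_lost(["PENDING", "ASSIGNED", "RUNNING", "LOST"]))
    print(will_reschedule_after_lost(["RUNNING", "KILLING", "LOST"]))
```

The first call models a task lost while running, the second a task lost after
a kill request; only the first would be rescheduled under the rule as stated.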
This is fantastic (and I'm glad that my understanding was mostly correct) -
thanks a lot.
Might I suggest folding this information into the user guide? Maybe it's
only relevant for my use case, but I feel like "tasks in terminal states
might be cloned and rescheduled; here's when that might happen
On Thu, Feb 19, 2015 at 1:27 PM, Hussein Elgridly <
huss...@broadinstitute.org> wrote:
> I've just spent the afternoon making a flowchart out of
> TaskStateMachine.java in an attempt to figure out what Aurora states
> actually mean. Given that all the jobs I submit have unique names and I
> don't
I've just spent the afternoon making a flowchart out of
TaskStateMachine.java in an attempt to figure out what Aurora states
actually mean. Given that all the jobs I submit have unique names and I
don't permit retries, I would like to put together a set of rules that
determine whether a job is _rea