In case you missed it, we need to update the scheduler to notice the TASK_ERROR state. I've filed https://issues.apache.org/jira/browse/AURORA-1001 to track the work for this on our end.

---------- Forwarded message ----------
From: Adam Bordelon <a...@mesosphere.io>
Date: Wed, Jan 7, 2015 at 1:59 PM
Subject: Re: TaskStatus source and reason fields
To: "u...@mesos.apache.org" <u...@mesos.apache.org>, dev <d...@mesos.apache.org>

FYI, Mesos wasn't actually considering TASK_ERROR a terminal state until now (0.22). No impact if you weren't using it yet.

commit 1c80d845431a57dd8c20e636ab7fc313602e4b49
Author: Connor Doyle <con...@mesosphere.io>
Date:   Wed Jan 7 13:33:16 2015 -0800

    TASK_ERROR is terminal

    TASK_ERROR was introduced as a new terminal state, but was not handled
    as such in mesos::internal::protobuf::isTerminalState.

    Review: https://reviews.apache.org/r/29610

On Mon, Nov 10, 2014 at 2:15 PM, Dominic Hamon <dha...@twopensource.com> wrote:

> I have a patch ready to land for MESOS-1143
> <https://issues.apache.org/jira/browse/MESOS-1143>. This adds TASK_ERROR
> to the possible states that a framework might receive as a status update.
> The semantics of TASK_LOST vs TASK_ERROR are simple: TASK_LOST means that
> attempting to reschedule the task should succeed. TASK_ERROR means that any
> attempt to reschedule the task will fail. This allows frameworks to make
> better decisions.
>
> Before it lands, I'd like to solicit feedback as this changes the
> semantics for frameworks (for the better!).
>
> Any thoughts or reservations?
>
> --
> Dominic Hamon | @mrdo | Twitter
> *There are no bad ideas; only good ideas that go horribly wrong.*
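
For anyone picking up the scheduler-side work, here is a rough sketch of the semantics described in the thread. It is written in Python for brevity and is not the actual Mesos or Aurora code. The names TaskState, TERMINAL_STATES, is_terminal and reschedule_hint are hypothetical, and the list of states is an assumption about what the TaskState enum looked like around 0.22. Only the TASK_LOST and TASK_ERROR behavior comes from the messages above.

```python
from enum import Enum
from typing import Optional


class TaskState(Enum):
    # Assumed subset of task states; check mesos.proto for the real list.
    TASK_STAGING = "TASK_STAGING"
    TASK_STARTING = "TASK_STARTING"
    TASK_RUNNING = "TASK_RUNNING"
    TASK_FINISHED = "TASK_FINISHED"
    TASK_FAILED = "TASK_FAILED"
    TASK_KILLED = "TASK_KILLED"
    TASK_LOST = "TASK_LOST"
    TASK_ERROR = "TASK_ERROR"


# States after which no further status updates are expected for the task.
# TASK_ERROR belongs here, which is the point of the commit quoted above.
TERMINAL_STATES = frozenset({
    TaskState.TASK_FINISHED,
    TaskState.TASK_FAILED,
    TaskState.TASK_KILLED,
    TaskState.TASK_LOST,
    TaskState.TASK_ERROR,
})


def is_terminal(state: TaskState) -> bool:
    """Hypothetical analogue of an isTerminalState check."""
    return state in TERMINAL_STATES


def reschedule_hint(state: TaskState) -> Optional[bool]:
    """Return True if rescheduling should succeed, False if it will fail,
    or None when the thread gives no guidance (framework policy decides)."""
    if state is TaskState.TASK_LOST:
        return True   # rescheduling the same task should succeed
    if state is TaskState.TASK_ERROR:
        return False  # any attempt to reschedule the task will fail
    return None
```

A scheduler following this sketch would treat both states as terminal when releasing resources, but would only requeue the task on TASK_LOST.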