Possible pros of having the scheduler do the updates:

 - The scheduler likely has the most direct information about job/task
SLA-style metrics, and can use it to help keep jobs within SLA during
an update.
 - If updates are specified as a "rate of change", then when tasks fail
in large jobs the scheduler can adjust the update rate automatically to
stay within SLA. It could also be opportunistic: when a failed task needs
a replacement anyway, launch the replacement with the new configuration
rather than the old one.
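
To make the second point a bit more concrete, here is a rough sketch of
what I have in mind. All of the names and parameters below are made up
for illustration (this is not Aurora's API), and I am assuming the SLA
can be expressed as a minimum percentage of healthy tasks:

```python
# Hypothetical sketch only: none of these names exist in Aurora.

def next_batch_size(total, healthy, min_healthy_percent, max_batch):
    """Return how many tasks may be restarted right now without
    dropping below the SLA (assumed: a minimum healthy percentage)."""
    # Tasks that must stay healthy, rounded up with integer math.
    required = (total * min_healthy_percent + 99) // 100
    # Healthy tasks we can afford to take down at this moment.
    headroom = healthy - required
    # Never exceed the configured rate, never go negative.
    return max(0, min(max_batch, headroom))


def config_for_replacement(update_active, old_config, new_config):
    """Opportunistic upgrade: a task that failed during an update is
    relaunched with the new config instead of the old one."""
    return new_config if update_active else old_config
```

With 100 tasks, a 95% floor and a max batch of 10, this allows 5 tasks
at a time while everything is healthy, 1 once four tasks have failed,
and pauses the update entirely if health drops below the floor.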

-Toby.


On Fri, Jul 25, 2014 at 11:41 AM, Bill Farner <wfar...@apache.org> wrote:

> Hi all,
>
> Rolling updates of services is a crucial feature in Aurora. As such, we
> want to take great care when changing its behavior. Today, Aurora operates
> by delegating this functionality to the client (or any API client, for that
> matter). While this has provided a nice abstraction, it turns out there are
> some shortcomings with this approach:
>
>   1. Visibility: since the scheduler does not know about updates, it cannot
> display useful information about an in-progress update
>   2. Visibility: for two users to diagnose a failed update, they must be at
> the same terminal, or copy/paste terminal output
>   3. Usability: the scheduler has no means to show information about how an
> application's packages or configuration changed over time
>   4. Usability: update orchestration in the client means a lost connection
> to the scheduler halts an update
>
> Some of the above issues can be addressed by moving update orchestration to
> a service external to the scheduler. At first glance, this approach is
> attractive, as there is a firm separation of concerns. However, there are a
> few pitfalls with this approach:
>
>   1. Usability: setup and maintenance of an Aurora cluster becomes even
> more complicated (additional service + storage system)
>   2. Usability: the user interface becomes more complicated to stitch
> together, as end-users really should only have to visit one website to view
> job information.
>   3. Complexity: implementing a new production-ready service from scratch
> will take a non-trivial amount of time
>
> With these issues in mind, I propose that the scheduler take over the
> responsibility of application update orchestration. This will allow us to
> solve the current design shortcomings, without the pitfalls of the separate
> service approach.
>
> I'm interested in thoughts others have on this. Does the reasoning seem
> sound? Are there things I'm missing?
>
>
> -=Bill
>
