> Can the service just use the Mesos core state abstraction? That comes
> along as a free dependency setting up an Aurora cluster.

If we take the separate service approach, I probably would not use the
replicated log. In the scheduler, we're already contemplating moving away
from it due to the amount of database reimplementation required. We would
likely use an off-the-shelf RDBMS.

> I assume 3 is a ~wash in terms of time with productionizing new scheduler
> code.

The two big things we would need to build ~from scratch are storage and
authentication/authorization. Also, we would need to come up with an answer
for auth delegation with a separate service.

-=Bill

On Fri, Jul 25, 2014 at 11:56 AM, John Sirois <john.sir...@gmail.com> wrote:

> Inline
>
> On Fri, Jul 25, 2014 at 12:41 PM, Bill Farner <wfar...@apache.org> wrote:
>
> > Hi all,
> >
> > Rolling updates of services is a crucial feature in Aurora. As such, we
> > want to take great care when changing its behavior. Today, Aurora operates
> > by delegating this functionality to the client (or any API client, for that
> > matter). While this has provided a nice abstraction, it turns out there are
> > some shortcomings with this approach:
> >
> > 1. Visibility: since the scheduler does not know about updates, it cannot
> >    display useful information about an in-progress update
> > 2. Visibility: for two users to diagnose a failed update, they must be at
> >    the same terminal, or copy/paste terminal output
> > 3. Usability: the scheduler has no means to show information about how an
> >    application's packages or configuration changed over time
> > 4. Usability: update orchestration in the client means a lost connection
> >    to the scheduler halts an update
> >
> > Some of the above issues can be addressed by moving update orchestration to
> > a service external to the scheduler. At first glance, this approach is
> > attractive, as there is a firm separation of concerns. However, there are a
> > few pitfalls with this approach:
> >
> > 1. Usability: setup and maintenance of an Aurora cluster becomes even
> >    more complicated (additional service + storage system)
> >
>
> Can the service just use the Mesos core state abstraction? That comes
> along as a free dependency setting up an Aurora cluster.
>
> >
> > 2. Usability: the user interface becomes more complicated to stitch
> >    together, as end-users really should only have to visit one website to
> >    view job information.
> > 3. Complexity: implementing a new production-ready service from scratch
> >    will take a non-trivial amount of time
> >
>
> I assume 3 is a ~wash in terms of time with productionizing new scheduler
> code.
>
> >
> > With these issues in mind, I propose that the scheduler take over the
> > responsibility of application update orchestration. This will allow us to
> > solve the current design shortcomings, without the pitfalls of the separate
> > service approach.
> >
> > I'm interested in thoughts others have on this. Does the reasoning seem
> > sound? Are there things I'm missing?
> >
> >
> > -=Bill
> >