Retaining the client update algorithm would require extra work on the scheduler side to satisfy the visibility requirements Bill outlined above, which may not be worth the effort. It would also create grounds for inconsistent update expectations and experience.
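For concreteness, the kind of custom schedule Brian gives as an example (1% -> 10% -> 25% -> 25% -> 25% -> rest) could be sketched roughly as below. This is only an illustration of the batching arithmetic, not Aurora client or scheduler code; the function name `staged_batches` and its parameters are made up for this example, and it assumes whole-number percentages of the total instance count, rounded up, with at least one instance per step.

```python
def staged_batches(instance_count, percentages):
    """Split instance IDs 0..instance_count-1 into successive update batches.

    Hypothetical helper, not part of Aurora. Each entry in `percentages`
    is the share of the whole job to update in that step; any instances
    left over after the listed steps form the final "rest" batch.
    """
    batches = []
    start = 0
    for pct in percentages:
        if start >= instance_count:
            break  # nothing left to update
        # Integer ceiling of instance_count * pct / 100, at least 1 instance.
        size = max(1, (instance_count * pct + 99) // 100)
        end = min(start + size, instance_count)
        batches.append(list(range(start, end)))
        start = end
    if start < instance_count:
        # The "rest" step: everything not covered by the listed percentages.
        batches.append(list(range(start, instance_count)))
    return batches


if __name__ == "__main__":
    for step, batch in enumerate(staged_batches(200, [1, 10, 25, 25, 25]), 1):
        print(f"step {step}: {len(batch)} instances")
```

A schedule like this is what the scheduler would need to support (or expose some equivalent knob for) if client-side update logic goes away.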
On Fri, Jul 25, 2014 at 1:34 PM, Brian Wickman <wick...@apache.org> wrote:

> Will the API for client-side updates still exist? Will the client continue
> to have its own implementation of 'update' (or perhaps an 'update --local'
> flag?) The reason I ask is whether customers should continue to have the
> flexibility to implement their own update algorithms (e.g. 1% -> 10% -> 25%
> -> 25% -> 25% -> rest.)
>
>
> On Fri, Jul 25, 2014 at 11:41 AM, Bill Farner <wfar...@apache.org> wrote:
>
> > Hi all,
> >
> > Rolling updates of services is a crucial feature in Aurora. As such, we
> > want to take great care when changing its behavior. Today, Aurora operates
> > by delegating this functionality to the client (or any API client, for that
> > matter). While this has provided a nice abstraction, it turns out there are
> > some shortcomings with this approach:
> >
> > 1. Visibility: since the scheduler does not know about updates, it cannot
> >    display useful information about an in-progress update
> > 2. Visibility: for two users to diagnose a failed update, they must be at
> >    the same terminal, or copy/paste terminal output
> > 3. Usability: the scheduler has no means to show information about how an
> >    application's packages or configuration changed over time
> > 4. Usability: update orchestration in the client means a lost connection
> >    to the scheduler halts an update
> >
> > Some of the above issues can be addressed by moving update orchestration to
> > a service external to the scheduler. At first glance, this approach is
> > attractive, as there is a firm separation of concerns. However, there are a
> > few pitfalls with this approach:
> >
> > 1. Usability: setup and maintenance of an aurora cluster becomes even
> >    more complicated (additional service + storage system)
> > 2. Usability: the user interface becomes more complicated to stitch
> >    together, as end-users really should only have to visit one website to
> >    view job information.
> > 3. Complexity: implementing a new production-ready service from scratch
> >    will take a non-trivial amount of time
> >
> > With these issues in mind, I propose that the scheduler take over the
> > responsibility of application update orchestration. This will allow us to
> > solve the current design shortcomings, without the pitfalls of the separate
> > service approach.
> >
> > I'm interested in thoughts others have on this. Does the reasoning seem
> > sound? Are there things I'm missing?
> >
> >
> > -=Bill
> >
>