Thanks for chiming in, everyone. We will be tracking the work with AURORA-610 [1].
[1] https://issues.apache.org/jira/browse/AURORA-610

-=Bill

On Fri, Jul 25, 2014 at 6:18 PM, Maxim Khutornenko <ma...@apache.org> wrote:

> Thanks for clarifying. Makes sense to me.
>
> On Fri, Jul 25, 2014 at 5:14 PM, Bill Farner <wfar...@apache.org> wrote:
> > Only the API methods on the scheduler; I propose that the client adopt
> > the scheduler's update orchestration and we delete the equivalent code
> > from the client.
> >
> > -=Bill
> >
> > On Fri, Jul 25, 2014 at 3:54 PM, Maxim Khutornenko <ma...@apache.org> wrote:
> >
> >> I am a bit confused. Are you suggesting we retain the current client
> >> updater algorithm or only the scheduler primitives it currently
> >> employs?
> >>
> >> On Fri, Jul 25, 2014 at 3:36 PM, Bill Farner <wfar...@apache.org> wrote:
> >> > Yeah, absolutely - we will retain AURORA-383
> >> > <https://issues.apache.org/jira/browse/AURORA-383> for that.
> >> >
> >> > -=Bill
> >> >
> >> > On Fri, Jul 25, 2014 at 2:48 PM, Brian Wickman <wick...@apache.org> wrote:
> >> >
> >> >> The scheduler API should know when jobs are locked, though, right?
> >> >> That information could be made available to the UI.
> >> >>
> >> >> On Fri, Jul 25, 2014 at 2:40 PM, Bill Farner <wfar...@apache.org> wrote:
> >> >>
> >> >> > I think the current API primitives used for updates (kill, add)
> >> >> > will continue to make sense, so a client could implement updates
> >> >> > that way. However, these will not appear as updates to the
> >> >> > scheduler.
> >> >> >
> >> >> > -=Bill
> >> >> >
> >> >> > On Fri, Jul 25, 2014 at 2:31 PM, Maxim Khutornenko <ma...@apache.org> wrote:
> >> >> >
> >> >> > > Retaining the client update algorithm would require extra work
> >> >> > > on the scheduler side to satisfy the visibility requirements
> >> >> > > Bill outlined above, which may not be worth the effort. That
> >> >> > > would also create ground for inconsistent update expectations
> >> >> > > and experience.
> >> >> > >
> >> >> > > On Fri, Jul 25, 2014 at 1:34 PM, Brian Wickman <wick...@apache.org> wrote:
> >> >> > >
> >> >> > > > Will the API for client-side updates still exist? Will the
> >> >> > > > client continue to have its own implementation of 'update'
> >> >> > > > (or perhaps an 'update --local' flag?) The reason I ask is
> >> >> > > > whether customers should continue to have the flexibility to
> >> >> > > > implement their own update algorithms (e.g. 1% -> 10% -> 25%
> >> >> > > > -> 25% -> 25% -> rest.)
> >> >> > > >
> >> >> > > > On Fri, Jul 25, 2014 at 11:41 AM, Bill Farner <wfar...@apache.org> wrote:
> >> >> > > >
> >> >> > > > > Hi all,
> >> >> > > > >
> >> >> > > > > Rolling updates of services is a crucial feature in Aurora.
> >> >> > > > > As such, we want to take great care when changing its
> >> >> > > > > behavior. Today, Aurora operates by delegating this
> >> >> > > > > functionality to the client (or any API client, for that
> >> >> > > > > matter). While this has provided a nice abstraction, it
> >> >> > > > > turns out there are some shortcomings with this approach:
> >> >> > > > >
> >> >> > > > > 1. Visibility: since the scheduler does not know about
> >> >> > > > >    updates, it cannot display useful information about an
> >> >> > > > >    in-progress update
> >> >> > > > > 2. Visibility: for two users to diagnose a failed update,
> >> >> > > > >    they must be at the same terminal, or copy/paste
> >> >> > > > >    terminal output
> >> >> > > > > 3. Usability: the scheduler has no means to show
> >> >> > > > >    information about how an application's packages or
> >> >> > > > >    configuration changed over time
> >> >> > > > > 4. Usability: update orchestration in the client means a
> >> >> > > > >    lost connection to the scheduler halts an update
> >> >> > > > >
> >> >> > > > > Some of the above issues can be addressed by moving update
> >> >> > > > > orchestration to a service external to the scheduler. At
> >> >> > > > > first glance, this approach is attractive, as there is a
> >> >> > > > > firm separation of concerns. However, there are a few
> >> >> > > > > pitfalls with this approach:
> >> >> > > > >
> >> >> > > > > 1. Usability: setup and maintenance of an Aurora cluster
> >> >> > > > >    becomes even more complicated (additional service +
> >> >> > > > >    storage system)
> >> >> > > > > 2. Usability: the user interface becomes more complicated
> >> >> > > > >    to stitch together, as end-users really should only
> >> >> > > > >    have to visit one website to view job information.
> >> >> > > > > 3. Complexity: implementing a new production-ready service
> >> >> > > > >    from scratch will take a non-trivial amount of time
> >> >> > > > >
> >> >> > > > > With these issues in mind, I propose that the scheduler
> >> >> > > > > take over the responsibility of application update
> >> >> > > > > orchestration. This will allow us to solve the current
> >> >> > > > > design shortcomings, without the pitfalls of the separate
> >> >> > > > > service approach.
> >> >> > > > >
> >> >> > > > > I'm interested in thoughts others have on this. Does the
> >> >> > > > > reasoning seem sound? Are there things I'm missing?
> >> >> > > > >
> >> >> > > > > -=Bill
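
For reference, the staged schedule Brian mentions in the thread (1% -> 10% -> 25% -> 25% -> 25% -> rest) could be turned into concrete instance batches roughly as follows. This is only a minimal sketch, not Aurora code: the function name `staged_batches`, the round-up rule, and the one-instance minimum per step are all assumptions made for illustration.

```python
def staged_batches(instance_count, percents):
    """Split instance IDs 0..instance_count-1 into successive update batches.

    Hypothetical helper, not part of Aurora. Each entry in `percents` is an
    integer percentage of the whole job to update in that step. Anything
    left after the last step forms a final "rest" batch.
    """
    batches = []
    start = 0
    for pct in percents:
        if start >= instance_count:
            break
        # Assumption: round up, and always update at least one instance.
        size = max(1, (instance_count * pct + 99) // 100)
        end = min(start + size, instance_count)
        batches.append(list(range(start, end)))
        start = end
    if start < instance_count:
        # The "rest" step from the schedule in the thread.
        batches.append(list(range(start, instance_count)))
    return batches


if __name__ == "__main__":
    for step, batch in enumerate(staged_batches(200, [1, 10, 25, 25, 25])):
        print(step, len(batch))
```

For a 200-instance job this yields batches of 2, 20, 50, 50, 50 and 28 instances. Whether such a schedule runs in the client (via the kill/add primitives) or in the scheduler is exactly the question the thread discusses.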