Summary of IRC Meeting in #aurora at Mon Sep 8 18:01:29 2014:

Attendees: lexinator, wickman, jfarrell, mchucarroll, wfarner, jcohen, kts, jaybuff, mkhutornenko, dlester

- Preface
- AURORA-351

IRC log follows:

## Preface ##
[Mon Sep 8 18:01:44 2014] <wfarner>: roll call?
[Mon Sep 8 18:01:45 2014] <wfarner>: here
[Mon Sep 8 18:02:43 2014] <wickman>: aye
[Mon Sep 8 18:02:57 2014] <mchucarroll>: here
[Mon Sep 8 18:03:01 2014] <jaybuff>: hi
[Mon Sep 8 18:03:11 2014] <mkhutornenko>: here
[Mon Sep 8 18:03:13 2014] <dlester>: morning
[Mon Sep 8 18:03:28 2014] <lexinator>: alive
[Mon Sep 8 18:04:28 2014] <jcohen>: here
[Mon Sep 8 18:05:02 2014] <kts>: here
[Mon Sep 8 18:05:10 2014] <wfarner>: i'd like to first mention that we're starting to think about proposals for this winter's OPW (Outreach Program for Women)
[Mon Sep 8 18:05:12 2014] <wfarner>: https://github.com/twitter/twitter.github.com/wiki/Outreach-Program-for-Women-(Winter-2014)
[Mon Sep 8 18:06:08 2014] <wfarner>: if anybody has an idea, please email the dev list, or contact me personally. this could be work already ticketed, or brand new ideas
[Mon Sep 8 18:07:20 2014] <wfarner>: as for what we've been up to over the last week, here's a peek at the JIRA delta:
[Mon Sep 8 18:07:21 2014] <wfarner>: https://issues.apache.org/jira/browse/AURORA-690?jql=project%20%3D%20AURORA%20and%20updatedDate%3E%3D-7d
[Mon Sep 8 18:07:41 2014] <wfarner>: there's been a decent amount of build trouble
[Mon Sep 8 18:08:30 2014] <wfarner>: i hope we can prevent a good deal of this in the future by switching to pure-python/java implementations of the mesos driver (libmesos.so)
[Mon Sep 8 18:08:49 2014] <wfarner>: though we are not actively working on that at the moment
[Mon Sep 8 18:09:27 2014] <jaybuff>: using the pure python libmesos for the executor would solve some of my headaches with running the executor in a docker container (see AURORA-633)
[Mon Sep 8 18:09:44 2014] <wickman>: tarnfeld has been doing a bit of work tightening up pesos
[Mon Sep 8 18:09:49 2014] <jaybuff>: is that just a matter of switching it out, or does more work need to be done on pesos?
[Mon Sep 8 18:09:50 2014] <wickman>: the portainer stuff uses it.
[Mon Sep 8 18:09:55 2014] <lexinator>: who is going to maintain the java/python implementations? the mesos guys?
[Mon Sep 8 18:10:26 2014] <wickman>: jaybuff: it's unclear -- i wouldn't feel comfortable without considerably more testing.
[Mon Sep 8 18:10:27 2014] <wfarner>: lexinator: i believe we would. there has been talk about this on the mesos side as well
[Mon Sep 8 18:10:57 2014] <jaybuff>: okay, so at a minimum pesos needs better unit test coverage before the executor can use it?
[Mon Sep 8 18:10:59 2014] <wickman>: jaybuff: the pesos+executor api has a little less surface area so might be safer overall, but still needs testing.
[Mon Sep 8 18:10:59 2014] <lexinator>: it would seem as soon as mesos 0.21, these implementations would lag behind.
[Mon Sep 8 18:11:08 2014] <wfarner>: the mesos devs do want 'vetted' implementations, and probably a canonical implementation, but i suspect that would come after some momentum builds around one
[Mon Sep 8 18:11:27 2014] <wickman>: jaybuff: the unit test coverage is pretty good but that's often not representative of real life
[Mon Sep 8 18:11:39 2014] <jaybuff>: i see
[Mon Sep 8 18:11:44 2014] <wfarner>: lexinator: you are right that the 'API' is a bit of a moving target right now
[Mon Sep 8 18:12:35 2014] <lexinator>: i think that will always be the case =]
[Mon Sep 8 18:12:45 2014] <wfarner>: however, i'd bet we've spent enough time battling build issues over the past week to justify maintaining this code
[Mon Sep 8 18:12:53 2014] <wfarner>: s/week/month/
[Mon Sep 8 18:13:14 2014] <wickman>: one issue with pesos is that it is wickman/pesos
[Mon Sep 8 18:13:28 2014] <wickman>: and there is a lot of external contribution that i don't have enough time to incorporate/pull.
[Mon Sep 8 18:13:34 2014] <wickman>: i'd love to see pesos go to a shared ownership model
[Mon Sep 8 18:13:35 2014] <wfarner>: https://github.com/wickman/pesos
[Mon Sep 8 18:13:40 2014] <wfarner>: https://github.com/kevints/mesos-framework-api
[Mon Sep 8 18:14:09 2014] <wfarner>: ditto for m-f-a
[Mon Sep 8 18:14:56 2014] <dlester>: there is a mesos github org https://github.com/mesos that could potentially be used as a future "home" for these projects if you'd like to eventually move them to something independent
[Mon Sep 8 18:14:58 2014] <wfarner>: i think it's safe to say that getting these feature complete and integrated with aurora will be a great way to start the discussion of other contributors
[Mon Sep 8 18:15:08 2014] <wickman>: dlester: we talked about this...but possibly mesos-bindings
[Mon Sep 8 18:15:34 2014] <wickman>: dlester: the issue boils down to what wfarner alluded to earlier regarding 'vetted' implementations...i imagine they don't want the top-level mesos to become a dumping ground for everything.
[Mon Sep 8 18:15:45 2014] <dlester>: got it
[Mon Sep 8 18:16:22 2014] <wfarner>: wickman: exactly, and i would rather a winner to be chosen more democratically than "we sit next to the right people" :-)
[Mon Sep 8 18:17:03 2014] <wfarner>: any last words on that topic?
[Mon Sep 8 18:18:03 2014] <wfarner>: moving along, we've made good progress on AURORA-610
[Mon Sep 8 18:18:36 2014] <wfarner>: davmclau posted some UI mocks: https://reviews.apache.org/r/25158/
[Mon Sep 8 18:19:20 2014] <jaybuff>: i like it!
[Mon Sep 8 18:19:28 2014] <wfarner>: the last big chunk is on me for some remaining scheduler-side logic, i'll be spending today (and likely tomorrow) to get to 100% test coverage
[Mon Sep 8 18:19:51 2014] <jaybuff>: this UI is all read only, right?
[Mon Sep 8 18:19:59 2014] <wfarner>: that's correct
[Mon Sep 8 18:20:13 2014] <wickman>: wfarner: is that just for now though?
[Mon Sep 8 18:20:31 2014] <wfarner>: we really want to follow up with a play/pause button
[Mon Sep 8 18:20:51 2014] <wfarner>: but there's groundwork to be laid for authentication in the scheduler web UI
[Mon Sep 8 18:20:53 2014] <kts>: that sounds like a good segway into AURORA-351
[Mon Sep 8 18:21:06 2014] <wickman>: kts: segue </grammar police>
[Mon Sep 8 18:21:16 2014] <wfarner>: kts: indeed
[Mon Sep 8 18:21:24 2014] <mchucarroll>: (When we talk about an issue, can we add the one-line discussion, instead of making everyone hit jira at once?)
[Mon Sep 8 18:21:25 2014] <wfarner>: one last thing on updates though:
[Mon Sep 8 18:21:40 2014] <kts>: ASFBot: help
[Mon Sep 8 18:22:43 2014] <wfarner>: we're going to be adding support for an external entity to monitor in-progress updates, anybody interested may want to follow along as that is fleshed out
[Mon Sep 8 18:23:14 2014] <wfarner>: AURORA-690: Add support for external update coordination
[Mon Sep 8 18:23:19 2014] <kts>: AURORA-351 - Consider using Apache Shiro for scheduler Authentication and Authorization
[Mon Sep 8 18:23:52 2014] <wickman>: wfarner: the use-case specifically being..?
[Mon Sep 8 18:25:01 2014] <wfarner>: ultimately it will allow custom external infrastructure to pause in-progress updates when problems are noticed
[Mon Sep 8 18:25:07 2014] <kts>: wickman: I think Maxim did a good job capturing the use case on the ticket, but the main one is an external (application-specific) monitoring service
[Mon Sep 8 18:25:30 2014] <wfarner>: a problem, for example, being something like an alert firing on the service
[Mon Sep 8 18:25:31 2014] <wickman>: ok, great!
[Mon Sep 8 18:25:37 2014] <jfarrell>: afternoon everyone
[Mon Sep 8 18:25:50 2014] <wfarner>: jfarrell: afternoon!
[Mon Sep 8 18:26:02 2014] <wfarner>: kts: feel free to shift discussion to -351 now

## AURORA-351 ##
[Mon Sep 8 18:27:08 2014] <jaybuff>: just FYI, we have a mostly working authen/authz implementation internally, but we haven't got approval to open source it yet.
[Mon Sep 8 18:27:21 2014] <kts>: I looked at wiring in shiro, both as a replacement for our current Capability system (that provides authorization) and as an authentication system
[Mon Sep 8 18:27:24 2014] <lexinator>: have you guys thought about using mesos authorization stuff?
[Mon Sep 8 18:27:40 2014] <lexinator>: http://mesos.apache.org/documentation/latest/authorization/
[Mon Sep 8 18:28:50 2014] <kts>: so there's a lot of term overload there, so i'll try to explain the mapping as i understand it
[Mon Sep 8 18:28:52 2014] <wfarner>: lexinator: this has come up, and is a good idea. the way the API has shaped up today, though, doesn't give us a way to perform arbitrary auth challenges
[Mon Sep 8 18:29:28 2014] <lexinator>: we are likely to go down this path due to wanting to support multiple frameworks.
[Mon Sep 8 18:29:57 2014] <wfarner>: i would be all in favor for seeing that API broadened to support use cases like this
[Mon Sep 8 18:30:12 2014] <jaybuff>: i had a back and forth with vinod about pushing user auth into mesos and he was against it. he wants to auth a framework, then delegate user auth to the framework
[Mon Sep 8 18:30:53 2014] <kts>: ive looked at shiro mostly in the context of aurora authenticating its own users and authorizing aurora-specific actions
[Mon Sep 8 18:31:09 2014] <wfarner>: jaybuff: it would be great to see that conversation in a public forum; it would give users a better idea what the non-goals are for mesos authorization
[Mon Sep 8 18:31:26 2014] <wfarner>: while semi-redundant, would you be willing to kick off that kind of thread on mesos users list?
[Mon Sep 8 18:31:38 2014] <jaybuff>: wfarner: totally
[Mon Sep 8 18:31:43 2014] <jaybuff>: i'll write something today
[Mon Sep 8 18:31:44 2014] <wfarner>: many many thanks
[Mon Sep 8 18:31:46 2014] <kts>: one example is "maintenance:create:example-slave-01"
[Mon Sep 8 18:31:55 2014] <kts>: which is an example of an aurora-specific permission
[Mon Sep 8 18:32:55 2014] <kts>: in the proof-of-concept world an aurora user (mapping to a shiro subject, authenticated via HTTP basic auth) can be assigned to a shiro role, which has a set of shiro permissions (like maintenance:*:*)
[Mon Sep 8 18:35:37 2014] <kts>: configuration of this happens either via a shiro.ini (flat text mapping), or implementation/configuration of a Shiro Realm (which just means extending an abstract JDBC or JNDI realm and providing URL/schema mapping functions)
[Mon Sep 8 18:36:02 2014] <kts>: authentication could also be replaced with (for example) SPNEGO
[Mon Sep 8 18:36:11 2014] <jaybuff>: kts: we just did simple list of authed users can launch/kill tasks as an aurora role. sounds like you want finer grained controls.
[Mon Sep 8 18:37:04 2014] <kts>: jaybuff: in the simple config case you'd just provide a list of users that can do that
[Mon Sep 8 18:37:13 2014] <wickman>: we have the need internally for users/principals with elevated privileges effectively
[Mon Sep 8 18:38:06 2014] <jaybuff>: *nod* we have no story around authorizing any of the aurora_admin commands
[Mon Sep 8 18:39:13 2014] <wfarner>: jaybuff: there's _sort of_ a story (twitter has one), but it's not for the faint of heart
[Mon Sep 8 18:39:31 2014] <wfarner>: but nonetheless, i want the story kevin mentions instead :-)
[Mon Sep 8 18:39:55 2014] <kts>: wfarner: twitter will still be free to make their story as complex as it wants (through implementation of a custom shiro Realm)
[Mon Sep 8 18:40:03 2014] <wfarner>: har har
[Mon Sep 8 18:40:18 2014] <wfarner>: any final words on this topic? any other topics before we wrap up?
[Mon Sep 8 18:40:25 2014] <jaybuff>: haha. we'll probably do IP whitelisting (only allow localhost) for certain URI paths
[Mon Sep 8 18:40:37 2014] <jaybuff>: lame, but it'll get us something
[Mon Sep 8 18:41:11 2014] <jaybuff>: we have the docker stuff working
[Mon Sep 8 18:41:14 2014] <jaybuff>: AURORA-633
[Mon Sep 8 18:41:15 2014] <kts>: jaybuff: curious how you're doing it now? running a reverse proxy in front of aurora and delegating to that?
[Mon Sep 8 18:42:10 2014] <jaybuff>: the docker stuff has three or four small issues i'll note in the JIRA
[Mon Sep 8 18:42:42 2014] <jaybuff>: kts: for CLI we're doing SPEGNO, for webui we're doing reverse proxy
[Mon Sep 8 18:42:42 2014] <kts>: that's all I have on that topic, I'll keep the ticket updated with any new developments
[Mon Sep 8 18:44:00 2014] <jaybuff>: (at least that's my understanding of the auth stuff, lex did most of the work, so he has more details)
[Mon Sep 8 18:44:11 2014] <wickman>: jaybuff: so are you bundling thermos into your docker images?
[Mon Sep 8 18:45:04 2014] <jaybuff>: wickman: no, we're mounting /lib64 et al into the container and relying on the container having a python
[Mon Sep 8 18:45:23 2014] <jaybuff>: and using LD_LIBRARY_PATH so ./aurora_executor can find libmesos.so
[Mon Sep 8 18:45:40 2014] <wickman>: wow ok. interesting.
[Mon Sep 8 18:45:56 2014] <jaybuff>: it works, but requires that the scheduler know where library paths on slaves are
[Mon Sep 8 18:45:58 2014] <wickman>: so you're also bind mounting in the directory(ies) that contain aurora components?
[Mon Sep 8 18:46:06 2014] <wfarner>: i'll let you guys continue this offline, going to wrap up the meeting
[Mon Sep 8 18:46:08 2014] <wfarner>: ASFBot: meeting stop

Meeting ended at Mon Sep 8 18:46:08 2014
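
Editor's note on the AURORA-351 discussion: kts describes an aurora user being assigned a role, and the role carrying colon-delimited permissions such as maintenance:*:* that cover a specific request like maintenance:create:example-slave-01. The following is a rough Python sketch of how that kind of wildcard matching could work. It is not Shiro's code and not the proof-of-concept from the ticket; the function names, the user names (alice, bob), the role names (operator, viewer), and the maintenance:read permission are all hypothetical, and the matching rules are a simplified reading of Shiro-style wildcard permissions.

```python
# Simplified sketch of Shiro-style wildcard permission matching.
# Everything here (names, users, roles) is illustrative, not Aurora or Shiro code.

def implies(granted: str, requested: str) -> bool:
    """Return True if the granted permission string covers the requested one."""
    # Parts are separated by ':'; each part may list alternatives separated by ','.
    granted_parts = [set(p.lower().split(",")) for p in granted.split(":")]
    requested_parts = [set(p.lower().split(",")) for p in requested.split(":")]

    for i, req in enumerate(requested_parts):
        if i >= len(granted_parts):
            # The grant is shorter: its missing trailing parts cover everything.
            return True
        part = granted_parts[i]
        if "*" not in part and not req <= part:
            return False

    # The grant is longer: any extra parts must be wildcards to still match.
    return all("*" in part for part in granted_parts[len(requested_parts):])


# Hypothetical flat mapping, similar in spirit to the [users] and [roles]
# sections of a shiro.ini file.
ROLE_PERMISSIONS = {
    "operator": ["maintenance:*:*"],
    "viewer": ["maintenance:read"],
}
USER_ROLES = {
    "alice": ["operator"],
    "bob": ["viewer"],
}


def is_permitted(user: str, requested: str) -> bool:
    """Check a request against every permission of every role the user holds."""
    return any(
        implies(granted, requested)
        for role in USER_ROLES.get(user, [])
        for granted in ROLE_PERMISSIONS.get(role, [])
    )


if __name__ == "__main__":
    print(is_permitted("alice", "maintenance:create:example-slave-01"))
    print(is_permitted("bob", "maintenance:create:example-slave-01"))
```

In the "simple config case" mentioned in the log, the mapping would shrink to a single role listing the users allowed to launch and kill tasks; the finer-grained permissions only matter once admin-style actions need separate treatment.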