Summary of IRC Meeting in #aurora at Mon Oct 27 18:02:11 2014:

Attendees: davmclau, wickman, jcohen, wfarner, Yasumoto, kts, mkhutornenko, 
zmanji, dlester

- Preface
- 0.6.0 release
- Client bug bash
- Mesos eggs for more platforms
  - Action: kts to reach out to mesos dev list about providing eggs
- Client stack trace logging
- Review bot
- All Things Hadoop podcast Aurora episode
- CI builds
  - Action: davmclau to investigate contributing to pex


IRC log follows:

## Preface ##
[Mon Oct 27 18:02:26 2014] <wfarner>: welcome, folks.  kicking off the weekly 
community meeting
[Mon Oct 27 18:02:36 2014] <wfarner>: Let's start with roll call
[Mon Oct 27 18:02:38 2014] <wfarner>: here
[Mon Oct 27 18:02:39 2014] <jcohen>: here
[Mon Oct 27 18:03:18 2014] <dlester>: present
[Mon Oct 27 18:03:24 2014] <mkhutornenko>: here
[Mon Oct 27 18:03:28 2014] <wfarner>: while we give that a few minutes, i'd 
like to restructure this a bit by gathering topics at the beginning
[Mon Oct 27 18:04:06 2014] <Yasumoto>: howdy howdy
[Mon Oct 27 18:04:06 2014] <wfarner>: topics i would like to discuss: 0.6.0 
release, new review bot, client bug bash, and the 'all things hadoop' podcast 
about Aurora
[Mon Oct 27 18:04:19 2014] <wfarner>: please offer up any other topics now
[Mon Oct 27 18:04:20 2014] <jcohen>: I’d like to discuss whether it’s 
worthwhile to provide mesos eggs for platforms other than used by the vagrant 
image
[Mon Oct 27 18:04:28 2014] <wfarner>: jcohen: added
[Mon Oct 27 18:04:49 2014] <Yasumoto>: Can we discuss reverting the client 
logging to a file on stack-trace?
[Mon Oct 27 18:04:56 2014] <wfarner>: Yasumoto: added
[Mon Oct 27 18:05:44 2014] <wickman>: here
[Mon Oct 27 18:06:22 2014] <wfarner>: any other topics?
[Mon Oct 27 18:07:43 2014] <wfarner>: if you come up with any as we proceed, 
feel free to PM me
## 0.6.0 release ##
[Mon Oct 27 18:07:48 2014] <wfarner>: AURORA-711
[Mon Oct 27 18:08:34 2014] <wfarner>: We've finally cleared out all the planned 
work for 0.6.0, i'll begin cutting the release today and if all goes well, i 
will kick off a vote by EOD
[Mon Oct 27 18:09:01 2014] <jcohen>: great :)
[Mon Oct 27 18:09:07 2014] <wfarner>: Once the vote has started, please help 
out by running the build through the courses so we can flush out any issues 
early.
[Mon Oct 27 18:10:05 2014] <wfarner>: this is a good segue to the next topic...
## Client bug bash ##
[Mon Oct 27 18:10:35 2014] <wfarner>: to avoid scope creep of 0.6.0, we removed 
a bunch of client-related work
[Mon Oct 27 18:11:25 2014] <wfarner>: to catch up on this, we have committed to 
a client bug-fix sprint at twitter
[Mon Oct 27 18:11:41 2014] <wfarner>: you can see the somewhat-prioritized 
backlog here: 
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=37&view=planning&quickFilter=156
[Mon Oct 27 18:12:36 2014] <wfarner>: we've made an effort to surface issues 
actively causing pain in the client, which we intend to pave the way for an 
0.7.0 release
[Mon Oct 27 18:13:19 2014] <wfarner>: 0.7.0 is shaping up to primarily be 
removal of many minor deprecated features, and deprecation of the 'v1' client
## Mesos eggs for more platforms ##
[Mon Oct 27 18:13:58 2014] <wfarner>: jcohen: the floor is yours
[Mon Oct 27 18:14:40 2014] <jcohen>: So, folks trying to install Aurora on 
other platforms are running into missing eggs it seems
[Mon Oct 27 18:15:02 2014] <jcohen>: I’m wondering if it makes sense for us 
to commit to publication of a known set of eggs to make life easier
[Mon Oct 27 18:15:34 2014] <jcohen>: In theory this shouldn’t be our 
responsibility (I’d imagine Mesos themselves would want to do this)?
[Mon Oct 27 18:15:49 2014] <kts>: +1, but for a different reason
[Mon Oct 27 18:15:51 2014] <jcohen>: But if the need to build an egg is a 
blocker for people trying out Aurora it’s probably in our best interest?
[Mon Oct 27 18:15:59 2014] <kts>: I'd like to start running CI against multiple 
platforms
[Mon Oct 27 18:16:05 2014] <jcohen>: that’d be great as well
[Mon Oct 27 18:16:43 2014] <wfarner>: kts: do you have any plans for how to 
accomplish that?
[Mon Oct 27 18:17:16 2014] <kts>: None yet, ideally we'd have a CI environment 
that provided root in a container of the target OS
[Mon Oct 27 18:17:21 2014] <mkhutornenko>: It's a bit unusual to build eggs 
ourselves though. I'd expect Mesos to be a more logical owner of that process.
[Mon Oct 27 18:17:55 2014] <jcohen>: Could we provide multiple vagrant configs 
with different base boxes?
[Mon Oct 27 18:18:09 2014] <kts>: jcohen: that's what the make-mesos-eggs.sh 
script does
[Mon Oct 27 18:18:14 2014] <dlester>: has anyone brought this up on the Mesos 
mailing list?
[Mon Oct 27 18:18:17 2014] <kts>: at least to build the eggs
[Mon Oct 27 18:18:22 2014] <jcohen>: (I meant for ci)
[Mon Oct 27 18:18:34 2014] <kts>: yeah that's the most likely approach
[Mon Oct 27 18:18:48 2014] <mkhutornenko>: dlester: +1 on bringing this up with 
Mesos first
[Mon Oct 27 18:18:49 2014] <wfarner>: kts: do you mind if i tag you to open the 
discussion on mesos' dev list?
[Mon Oct 27 18:19:27 2014] <wickman>: alternately we push harder for pure 
bindings.
[Mon Oct 27 18:19:44 2014] <kts>: wfarner: sure, but I don't think that should 
be a blocker to improving our own CI
[Mon Oct 27 18:19:50 2014] <wfarner>: wickman: i think that's the right 
long-term approach, but we're actively losing users with the current state
[Mon Oct 27 18:19:59 2014] <jcohen>: that’d be great as well, but it’s a 
longer term solution I suspect?
[Mon Oct 27 18:20:11 2014] <kts>: wickman: +1, ultimately we want to push this 
to pure-language bindings
[Mon Oct 27 18:20:18 2014] <wfarner>: kts: i agree, but lets not get too cozy 
with doing all of this
[Mon Oct 27 18:20:32 2014] <wfarner>: #action kts to reach out to mesos dev 
list about providing more eggs
[Mon Oct 27 18:20:41 2014] <wfarner>: s/more //
[Mon Oct 27 18:21:14 2014] <kts>: will do
## Client stack trace logging ##
[Mon Oct 27 18:21:24 2014] <wfarner>: Yasumoto: floor is yours
[Mon Oct 27 18:21:48 2014] <Yasumoto>: We've attempted to clean up the process 
for end-users when the client presents a stack trace
[Mon Oct 27 18:22:23 2014] <Yasumoto>: In practice, I've found that as a user 
it actually leads to more confusion, and then I'm winding up with a directory 
of files that stick around for a while
[Mon Oct 27 18:22:47 2014] <Yasumoto>: While running some tests, stack traces 
were being caught by the re-routing, which led to 
https://reviews.apache.org/r/26802/
[Mon Oct 27 18:23:09 2014] <Yasumoto>: but I've discarded that review, as that 
highlighted the level of patching we're really doing to make it work
[Mon Oct 27 18:23:36 2014] <Yasumoto>: I'm proposing we remove the log 
redirection for now, and re-consider the approach so we don't leave quite as 
many edge-cases hanging
[Mon Oct 27 18:25:08 2014] <wfarner>: Relevant dev@ thread that took us down 
this road: 
http://mail-archives.apache.org/mod_mbox/incubator-aurora-dev/201410.mbox/%3CCAFGkSCm%2B5jJZPXmEm1%3DWNz2tSh8Ld%2BEiO2KYE6Yco%3DpB_chekQ%40mail.gmail.com%3E
[Mon Oct 27 18:26:10 2014] <mkhutornenko>: +1 on rolling it back and rethinking 
the approach to at least not hinder unit test failures.
[Mon Oct 27 18:26:57 2014] <wfarner>: let's not have lazy consensus here - i 
know there are more stakeholders on this
[Mon Oct 27 18:27:39 2014] <kts>: +1 on rolling back - my position 
(http://mail-archives.apache.org/mod_mbox/incubator-aurora-dev/201410.mbox/%3ccaaath-aoyz3srtypwi+bu5p7xvtg3+8ybfydiuz2fwuweij...@mail.gmail.com%3E)
 hasn't changed here
[Mon Oct 27 18:28:17 2014] <jcohen>: I’m +1 on rolling it back and getting a 
better implementation in place (I thought we already *had* rolled it back tbh).
[Mon Oct 27 18:28:54 2014] <wfarner>: I'm also +1, i would rather tackle the 
causes of uncaught exceptions
[Mon Oct 27 18:29:03 2014] <Yasumoto>: I just filed 
https://issues.apache.org/jira/browse/AURORA-896 if anyone wants to discuss 
afterward
[Mon Oct 27 18:29:35 2014] <Yasumoto>: Sounds like there's mainly a majority- 
at the very least to remove it in the short-term so we can re-think the 
implementation
[Mon Oct 27 18:29:41 2014] <Yasumoto>: I'll have a review out later this week
## Review bot ##
[Mon Oct 27 18:29:57 2014] <wfarner>: AURORA-883
[Mon Oct 27 18:30:20 2014] <wfarner>: last week i added a jenkins job that 
replies to code reviews with build results, so don't be surprised when you see 
these review replies
[Mon Oct 27 18:30:42 2014] <wfarner>: for example: 
https://reviews.apache.org/r/27058/
[Mon Oct 27 18:31:17 2014] <mkhutornenko>: wfarner: thanks for doing that! it's 
already proved itself useful for catching python style issues.
[Mon Oct 27 18:31:26 2014] <wfarner>: the current implementation will build 
every diff, feel free to hack on the code: 
https://github.com/apache/incubator-aurora/blob/master/build-support/jenkins/review_feedback.py
[Mon Oct 27 18:31:37 2014] <mkhutornenko>: wfarner: any chance we could 
suppress emails from it though?
[Mon Oct 27 18:32:18 2014] <wfarner>: your best bet is a client-side filter, 
reviewboard is configured to email the group on every reply, and i don't think 
it has more control than that
[Mon Oct 27 18:32:37 2014] <mkhutornenko>: that's what I thought but wanted to 
give it shot anyway :)
## All Things Hadoop podcast Aurora episode ##
[Mon Oct 27 18:36:21 2014] <wfarner>: Joe Stein (creator of All Things Hadoop 
podcast) published an episode in which he and i are chatting about Aurora.  You 
might find it interesting: 
https://twitter.com/allthingshadoop/status/526763573964701697
[Mon Oct 27 18:36:59 2014] <wfarner>: That's all i have for today, any other 
last-minute topics?
## CI builds ##
[Mon Oct 27 18:37:28 2014] <zmanji>: @wfarner: You might want to send an email 
to the dev list about the podcast
[Mon Oct 27 18:37:52 2014] <wfarner>: zmanji: will do
[Mon Oct 27 18:38:05 2014] <kts>: it looks like CI reliability has improved a 
bit since switching to pip-bootstrapped pants
[Mon Oct 27 18:38:14 2014] <kts>: however, we're still seeing python timeout 
errors: https://builds.apache.org/job/Aurora/
[Mon Oct 27 18:38:44 2014] <kts>: (but now in a later stage of the build)
[Mon Oct 27 18:39:26 2014] <kts>: we've got essentially 3 options to improve 
this
[Mon Oct 27 18:39:44 2014] <kts>: 1) make a python sdist vendor cache somewhere 
that CI can see it (probably svn.apache.org)
[Mon Oct 27 18:40:18 2014] <kts>: 2) switch to a tool that gives better control 
over the timeouts in play here
[Mon Oct 27 18:40:42 2014] <kts>: 3) contribute upstream to pants/pex to get 
better control of these timeouts
[Mon Oct 27 18:41:11 2014] <kts>: im personally leaning toward 1)
[Mon Oct 27 18:41:48 2014] <kts>: does anyone have opinions here?
[Mon Oct 27 18:41:49 2014] <Yasumoto>: I feel like that bandaids the issue.. we 
might want to consider improving the tool reliability more
[Mon Oct 27 18:41:55 2014] <Yasumoto>: (aka #3)
[Mon Oct 27 18:42:14 2014] <wfarner>: i recall some interest in using 
requirements.txt for all python dependencies, am i crossing wires or is that 
rolled up in (2)?
[Mon Oct 27 18:42:32 2014] <kts>: that would be a prerequisite to 2)
[Mon Oct 27 18:43:04 2014] <kts>: AURORA-617
[Mon Oct 27 18:43:08 2014] <davmclau>: just to clarify - since switching to 
pip-bootstrapped pants, we have seen no more problems with that part of the 
build? or just less?
[Mon Oct 27 18:43:42 2014] <kts>: builds 686 through 693 all passed with that 
change
[Mon Oct 27 18:44:13 2014] <wfarner>: our build queue, for context: 
https://builds.apache.org/job/Aurora
[Mon Oct 27 18:44:21 2014] <kts>: nothing has failed due to that
[Mon Oct 27 18:44:48 2014] <davmclau>: for (1) would we have to manually 
maintain that cache as dependencies change?
[Mon Oct 27 18:44:54 2014] <kts>: yes we would
[Mon Oct 27 18:44:55 2014] <wfarner>: some data from the other side - we had a 
build failure this morning in pex resolution
[Mon Oct 27 18:45:00 2014] <davmclau>: I'm in favor of (2) or (3) then
[Mon Oct 27 18:45:40 2014] <kts>: there's an automated solution to 1) as well - 
we could ask infra to setup a devpi instance http://doc.devpi.net/latest/
[Mon Oct 27 18:45:59 2014] <kts>: no idea how much work/what that would entail
[Mon Oct 27 18:46:07 2014] <wfarner>: kts: is AURORA-617 potentially an 
immediate improvement, without necessarily going all the way to pip?
[Mon Oct 27 18:46:09 2014] <jcohen>: Any of those options seem reasonable to 
me. I guess in order I’d say 1 (via devpi), 3, 2
[Mon Oct 27 18:46:22 2014] <kts>: wfarner: it's a functional noop
[Mon Oct 27 18:46:25 2014] <davmclau>: yes, I agree with that
[Mon Oct 27 18:46:26 2014] <wfarner>: ok
[Mon Oct 27 18:46:30 2014] <jcohen>: AURORA-617
[Mon Oct 27 18:46:42 2014] <davmclau>: (with jcohen)
[Mon Oct 27 18:46:53 2014] <kts>: basically instead of a BUILD file in 3rdparty 
there'd be a requirements.txt file
[Mon Oct 27 18:46:57 2014] <kts>: it's mostly a trivial change
[Mon Oct 27 18:48:05 2014] <davmclau>: Do we have any idea how much work (3) is?
[Mon Oct 27 18:48:20 2014] <kts>: I don't
[Mon Oct 27 18:48:36 2014] <davmclau>: okay, I can investigate
[Mon Oct 27 18:48:40 2014] <wickman>: davmclau: it will be a matter of adding 
retries to the pex resolvers
[Mon Oct 27 18:48:46 2014] <wickman>: davmclau: and then upgrading pants to use 
an upgraded version of pex
[Mon Oct 27 18:48:58 2014] <davmclau>: and then upgrading pants in our repo
[Mon Oct 27 18:49:02 2014] <wickman>: correct
[Mon Oct 27 18:50:08 2014] <kts>: #action davmaclau to investigate contributing 
to pex
[Mon Oct 27 18:50:25 2014] <kts>: *davmclau
[Mon Oct 27 18:50:32 2014] <wfarner>: Sounds like we've quiesced, going to 
close up
[Mon Oct 27 18:50:45 2014] <wfarner>: ASFBot702: meeting stop


Meeting ended at Mon Oct 27 18:50:45 2014
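
Editor's note on option 3) from the CI builds discussion: wickman describes the upstream fix as "adding retries to the pex resolvers". The sketch below only illustrates that general idea (retrying a flaky network call with exponential backoff). It is not pex's or pants' actual API; `retry`, `fetch`, `attempts`, `base_delay`, and `sleep` are hypothetical names chosen for this example.

```python
import time


def retry(fetch, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call fetch() up to `attempts` times, backing off between failures.

    `fetch` is a hypothetical zero-argument callable standing in for a
    dependency download; it is not a real pex or pants function.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fetch()
        except (TimeoutError, ConnectionError):
            # Out of retries: re-raise the last error to the caller.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

A caller would wrap the download step, for example `retry(lambda: download(url))`, where `download` is whatever function performs the fetch. Passing `sleep` as a parameter keeps the helper easy to test without real delays.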
