Some responses inline. Would be good to get some more feedback from folks in the Google doc. I think some of the particular details may remain hazy until we begin to lay hands to code to make things more concrete.
On Mon, Mar 26, 2018 at 3:49 PM, Phillip Cloud <cpcl...@gmail.com> wrote:
> On Mon, Mar 26, 2018 at 1:37 PM Wes McKinney <wesmck...@gmail.com> wrote:
>
>> > Huge +1 on moving some of the packaging outside the scope of
>> > responsibility of arrow dev, specifically I don't think we should be
>> > responsible for anything except wheels and conda packages.
>
> This is my ideal scenario, however unrealistic at the moment.
>
>> In theory I agree, but until Apache Arrow grows popular enough and
>> important enough for other communities to assume responsibility for
>> timely packaging on standard platforms like major Linux distributions,
>> we are going to have to do it; otherwise it will harm the growth of
>> the community (since users will have a hard time installing the
>> software on various platforms).
>>
>> Before changing the scope of what we are committing ourselves to do, I
>> would like to see if we can develop suitable automation around the
>> things we already have implemented. I don't want to write some things
>> off as being "too hard" or "too much work for the Apache Arrow
>> community" without giving a concerted automation effort a try.
>
> Then I think we need to have acknowledgement from committers willing to
> take on responsibility for packages when they fail to build during
> automated packaging, and it needs to be documented in the Arrow repo. I
> don't think it's reasonable for all committers to be responsible for every
> package type on every platform when something goes wrong. Ideally
> automation will alleviate most or all of the issues here, but things will
> still fail and need specific owners.
>
> For example, I am willing to own conda packaging for all platforms and pip
> packaging for Windows. I am not willing to own debian, yum, or pip
> packaging for other platforms.
> When I say own, what I mean is that when the
> automated package build fails and I get an email, I will respond ASAP with
> either a fix or by contacting the appropriate person. Of course, this isn't
> set in stone. If I'm able to help in other areas and I have the time, then
> I will.

I think the simplest way to think about packaging is as a "feature". The
fact that we can "deliver Python wheels on macOS" is a feature of Apache
Arrow. So it turned out that we had a feature that we were not testing
often enough. Any feature in the project generally will have one or more
owners, so packaging should be semantically the same from a maintenance
standpoint.

>> Note that we have other things we should be automating, but we are not:
>>
>> * Nightly performance benchmarking (e.g. ASV for Python)
>> * Nightly integration tests (Spark, HDFS, Dask, API docs, etc.)
>> * Running GPU-enabled tests in CI
>> * Building GPU-enabled binaries
>>
>> > +1. Trying to satisfy everyone's downstream needs is an impossible task.
>>
>> I think it's really easy to say "let's remove options" and "let's make
>> the build system simpler" without assessing the consequences of this
>> to community/project growth. The reason there are a lot of options and
>> the build system has complexity is that we are trying to satisfy a
>> pretty large matrix of requirements developed organically over the
>> last 2+ years.
>
> It's not clear to me what is not in scope. Where do we draw the line for
> testing projects downstream of Arrow? How do we decide whether to test
> those projects or not? Do we test them on all platforms that they support?
>
> IMO testing dependents of Arrow should be the responsibility of those
> particular pieces of software, not the responsibility of the Arrow project.
> Testing Spark and Dask, for example, both seem like the responsibility of
> their respective projects. Does asking these projects to do this hurt
> community/project growth in some way?
> Are there other large, successful projects that take on significant testing
> of their dependents? If there are, we should look at how they have
> addressed this issue.

I think at some point the burden of testing will shift more to downstream
projects, but while Arrow support is experimental or new, we'll need to
take a more active role in at least monitoring and reporting whether
things are broken.

>> The central problem we are having is that our continuous integration
>> and continuous delivery have not scaled to cover the diversity of use
>> cases that we have accumulated. If we punt on addressing our
>> automation problems and instead start removing build or packaging
>> functionality to make things simpler, eventually the project will grow
>> until we are dealing with a different kind of development workflow
>> crisis.
>
> I don't want to punt on automation; we need to do that regardless.
>
> What do you think about having specific owners of packaging areas,
> documented in the repo?

Yep, let's definitely do that.

- Wes

>> On Mon, Mar 26, 2018 at 11:58 AM, Phillip Cloud <cpcl...@gmail.com> wrote:
>> > Responses inline. This kind of information is extremely helpful and
>> > informative.
>> >
>> > On Mon, Mar 26, 2018 at 11:26 AM Antoine Pitrou <anto...@python.org> wrote:
>> >
>> >> Hi,
>> >>
>> >> As someone who started contributing recently, I'd like to raise a few
>> >> points. I hope this post doesn't come across as rambling or clueless;
>> >> if so, feel free to ignore it / point it out :-)
>> >>
>> >> What does a release require?
>> >> ============================
>> >>
>> >> I didn't find any official documentation answering that question.
>> >
>> > https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md
>> > is the one I used for the most recent release.
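The "specific owners of packaging areas, documented in the repo" idea could be captured as data rather than prose. A purely illustrative sketch (the area names, owner handles, and fallback address are hypothetical; nothing like this exists in the Arrow repo today):

```python
# Hypothetical sketch: a documented mapping from packaging area to its
# owner, consulted when an automated nightly package build fails.
# Unowned areas fall back to notifying the dev@ mailing list.

PACKAGING_OWNERS = {
    "conda-linux": "cpcloud",
    "conda-osx": "cpcloud",
    "conda-win": "cpcloud",
    "wheel-win": "cpcloud",
    "wheel-linux": None,  # unowned: needs a volunteer
    "deb": None,
    "rpm": None,
}

def who_to_notify(failed_area):
    """Return whom to contact when the build for `failed_area` breaks."""
    owner = PACKAGING_OWNERS.get(failed_area)
    return owner if owner is not None else "dev@arrow.apache.org"
```

A nightly job that emails `who_to_notify(area)` on failure would give each package type a specific responder, instead of an undifferentiated alert to all committers.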
>> >> Right now, the in-line CI in the arrow repository ensures we have the
>> >> following:
>> >> - the source base builds fine in *some* configurations on each of the
>> >>   three major platforms (Linux, macOS, Windows)
>> >> - the various test suites run fine on each of those platforms
>> >>
>> >> Is it a requirement that binary packages can be produced reliably for
>> >> a number of platforms, and if so, which ones? Is it a requirement that
>> >> binary packages are available from day one when a release is done, or
>> >> is that a best-effort thing depending on the availability of specific
>> >> platform maintainers? It would be useful to spell that out somewhere.
>> >
>> > The release management doc doesn't spell out the specific nitty-gritty of
>> > what exactly the artifacts do/don't and should/shouldn't contain, though
>> > it does contain some information about how to produce some of the
>> > artifacts. It's critical that we spell this out somewhere.
>> >
>> >> Who is responsible for producing packages?
>> >> ==========================================
>> >>
>> >> Right now it seems packages are all produced out of a single repository,
>> >> "arrow-dist". That repository handles production of binary artifacts:
>> >> Python wheels, Ubuntu / CentOS / Debian packages...
>> >>
>> >> It's not obvious if specific people are responsible for each of the
>> >> package production chains. It's common in open source projects to have
>> >> dedicated persons (or teams) responsible for each platform target.
>> >> This ensures that 1) the packages are produced by motivated people
>> >> who are familiar enough with their platforms of interest and 2) producing
>> >> packages does not otherwise drain the stamina of the development team.
>> >
>> > Huge +1 on moving some of the packaging outside the scope of
>> > responsibility of arrow dev; specifically, I don't think we should be
>> > responsible for anything except wheels and conda packages.
>> > One question I have here is: are the separate package-type
>> > scripts/software maintained in different repositories?
>> >
>> > Also +1 on having a person responsible for each platform. I wonder if
>> > having a person responsible for a specific kind of artifact might spread
>> > the workload more evenly, since there's likely a shortage of Windows
>> > expertise.
>> >
>> >> CI strategy
>> >> ===========
>> >>
>> >> We have two conflicting requirements:
>> >> 1) Test as much as possible as part of continuous integration
>> >>    (including, possibly, the production of viable binary packages)
>> >> 2) Keep CI times reasonable to avoid grinding. Some significant work
>> >>    was done recently to cut down our build times on Travis-CI
>> >>    and AppVeyor, often by half (ARROW-2071, ARROW-2083, ARROW-2231).
>> >>
>> >> To give a point of comparison, CPython has a two-pronged approach:
>> >>
>> >> 1) in-line CI using Travis-CI and AppVeyor, with simple build matrices
>> >>    (1 build on AppVeyor, 2 required + 2 optional on Travis-CI). In-line
>> >>    CI must validate for a PR to be merged.
>> >> 2) out-of-line CI using a farm of buildbots:
>> >>    http://buildbot.python.org/all/#/grid?branch=master
>> >
>> > Buildbot looks *a lot* better than the last time I looked at it :)
>> >
>> >> Each buildbot has a maintainer, interested in keeping that specific
>> >> platform and configuration running. Some buildbots are marked stable and
>> >> strongly recommended to be green at all times (and especially when
>> >> releasing). Some buildbots, on the other hand, are marked unstable and
>> >> represent less mainstream configurations which are just "nice to fix".
>> >>
>> >> The take-aways here are:
>> >> * Mainline development isn't throttled by the production of binary
>> >>   artifacts or testing on a myriad of (possibly slow or busy) CI
>> >>   platforms.
>> >> * Each tested configuration has a maintainer willing to identify and
>> >>   diagnose problems (either propose a solution themselves or notify the
>> >>   developer responsible for a regression).
>> >> * Some things are release blockers (the "stable" platforms), some are
>> >>   not and are just nice to have.
>> >
>> > IMO the "stable" platforms should be conda packages for arrow/pyarrow
>> > and pip wheels. We should discuss that more.
>> >
>> >> Two side notes:
>> >> * CPython is a much simpler project than Arrow, since it's C99 with
>> >>   minimal dependencies.
>> >> * I wouldn't necessarily recommend buildbot as a CI platform.
>> >>
>> >> Build options
>> >> =============
>> >>
>> >> It may be useful to look into reducing the number of build options,
>> >> and/or standardizing on supported settings, per platform. For example,
>> >> we should decide whether boost should be bundled or not, namespaced or
>> >> not, on each platform. People with specific development requirements
>> >> can try to override that, but with no guarantee from us.
>> >
>> > +1. Trying to satisfy everyone's downstream needs is an impossible task.
>> >
>> >> For example, on the llvmlite project we decided early on that we would
>> >> always link LLVM statically. Third-party maintainers may decide to do
>> >> things differently, but they would have to maintain their own build
>> >> scripts or patches.
>> >>
>> >> Regards
>> >>
>> >> Antoine.
>> >>
>> >> On 23/03/2018 at 17:58, Wes McKinney wrote:
>> >> > hi folks,
>> >> >
>> >> > So, I want to bring light to the problems we are having delivering
>> >> > binary artifacts after Arrow releases.
>> >> >
>> >> > We have some amount of packaging automation implemented in
>> >> > https://github.com/apache/arrow-dist using Travis CI and Appveyor to
>> >> > upload packages to Bintray, a package hosting service.
>> >> > Unfortunately, we discovered a bunch of problems with these packaging
>> >> > scripts after the release vote closed on Monday, and now, 4 days
>> >> > later, we still have been unable to post binaries to
>> >> > https://pypi.python.org/pypi/pyarrow
>> >> >
>> >> > This is no one's fault, but it highlights structural problems with our
>> >> > development process:
>> >> >
>> >> > * Why does producing packages after a release require error-prone
>> >> >   manual labor?
>> >> >
>> >> > * Why are we only finding out about packaging problems after a release
>> >> >   vote closes?
>> >> >
>> >> > * Why is setting up nightly binary builds a brittle and bespoke
>> >> >   process?
>> >> >
>> >> > I hope all agree that:
>> >> >
>> >> > * Packaging should not be a hardship or require a lot of manual labor
>> >> >
>> >> > * Packaging problems on the master branch should be made known within
>> >> >   ~24 hours, so they can be remedied immediately
>> >> >
>> >> > * It should be straightforward to produce binary artifacts for all
>> >> >   supported platforms and programming languages
>> >> >
>> >> > Eventually, we should include some binary artifacts in our release
>> >> > votes, but we are pretty far away from suitable automation to make
>> >> > this possible.
>> >> >
>> >> > I don't know any easy solutions, but Apache Arrow has grown widely
>> >> > used enough that I think it's worth our taking the time to plan and
>> >> > execute some solutions to these problems, which I expect to pay
>> >> > dividends in our community's productivity over time.
>> >> >
>> >> > Thanks,
>> >> > Wes
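The "packaging problems made known within ~24 hours" goal above amounts to a staleness check over nightly build results. An illustrative sketch only (the data shape and names are hypothetical, not an existing Arrow tool): a nightly job records the timestamp of the last green packaging build per package type, and a checker flags anything older than the threshold as likely broken on master.

```python
# Hypothetical sketch of the ~24-hour visibility goal: flag package types
# whose last successful nightly build is older than the threshold.

from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=24)

def stale_packages(last_success, now):
    """Given {package_type: datetime of last green build}, return the
    package types that look broken on master, sorted for stable output."""
    return sorted(pkg for pkg, ts in last_success.items()
                  if now - ts > STALE_AFTER)
```

Running something like this at the end of each nightly cycle and posting the result to the dev list would surface breakage within a day instead of after a release vote closes.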