Hi,

As someone who started contributing recently, I'd like to raise a few
points. I hope this post doesn't come across as rambling or clueless;
if it does, feel free to ignore it or point it out :-)
What does a release require?
============================

I didn't find any official documentation answering that question. Right
now, the in-line CI in the arrow repository ensures we have the
following:

- the source base builds fine in *some* configurations on each of the
  three major platforms (Linux, macOS, Windows)
- the various test suites run fine on each of those platforms

Is it a requirement that binary packages can be produced reliably for a
number of platforms, and if so, which ones? Is it a requirement that
binary packages are available from day one when a release is done, or
is that a best-effort thing depending on the availability of specific
platform maintainers? It would be useful to spell that out somewhere.

Who is responsible for producing packages?
==========================================

Right now it seems packages are all produced out of a single
repository, "arrow-dist". That repository handles production of binary
artifacts: Python wheels, Ubuntu / CentOS / Debian packages...

It's not obvious whether specific people are responsible for each of
the package production chains. It's common in open source projects to
have dedicated persons (or teams) responsible for each platform target.
This ensures that 1) the packages are produced by motivated people who
are familiar enough with their platforms of interest, and 2) producing
packages does not otherwise drain the stamina of the development team.

CI strategy
===========

We have two conflicting requirements:

1) Test as much as possible as part of continuous integration
   (including, possibly, the production of viable binary packages)
2) Keep CI times reasonable so that development doesn't grind to a halt

Some significant work was done recently to cut down our build times on
Travis-CI and AppVeyor, often by half (ARROW-2071, ARROW-2083,
ARROW-2231).
To give a point of comparison, CPython has a two-pronged approach:

1) in-line CI using Travis-CI and AppVeyor, with simple build matrices
   (1 build on AppVeyor, 2 required + 2 optional on Travis-CI). In-line
   CI must validate for a PR to be merged.
2) out-of-line CI using a farm of buildbots:
   http://buildbot.python.org/all/#/grid?branch=master
   Each buildbot has a maintainer, interested in keeping that specific
   platform and configuration running. Some buildbots are marked stable
   and strongly recommended to be green at all times (and especially
   when releasing). Other buildbots are marked unstable and represent
   less mainstream configurations which are just "nice to fix".

The take-aways here are:

* Mainline development isn't throttled by the production of binary
  artifacts or by testing on a myriad of (possibly slow or busy) CI
  platforms.
* Each tested configuration has a maintainer willing to identify and
  diagnose problems (either proposing a solution themselves or
  notifying the developer responsible for a regression).
* Some things are release blockers (the "stable" platforms); some are
  not and are just nice to have.

Two side notes:

* CPython is a much simpler project than Arrow, since it's C99 with
  minimal dependencies.
* I wouldn't necessarily recommend buildbot as a CI platform.

Build options
=============

It may be useful to look into reducing the number of build options,
and/or standardizing on supported settings, per platform. For example,
we should decide whether boost should be bundled or not, and namespaced
or not, on each platform. People with specific development requirements
can try to override that, but with no guarantee from us.

For example, on the llvmlite project we decided early on that we would
always link LLVM statically. Third-party maintainers may decide to do
things differently, but they would have to maintain their own build
scripts or patches.

Regards

Antoine.
On 23/03/2018 at 17:58, Wes McKinney wrote:
> hi folks,
>
> So, I want to bring light to the problems we are having delivering
> binary artifacts after Arrow releases.
>
> We have some amount of packaging automation implemented in
> https://github.com/apache/arrow-dist using Travis CI and Appveyor to
> upload packages to Bintray, a packaging hosting service.
>
> Unfortunately, we discovered a bunch of problems with these packaging
> scripts after the release vote closed on Monday, and now 4 days later,
> we still have been unable to post binaries to
> https://pypi.python.org/pypi/pyarrow
>
> This is no one's fault, but it highlights structural problems with our
> development process:
>
> * Why does producing packages after a release require error-prone manual
> labor?
>
> * Why are we only finding out about packaging problem after a release
> vote closes?
>
> * Why is setting up nightly binary builds a brittle and bespoke process?
>
> I hope all agree that:
>
> * Packaging should not be a hardship or require a lot of manual labor
>
> * Packaging problems on the master branch should be made known within
> ~24 hours, so they can be remedied immediately
>
> * It should be straightforward to produce binary artifacts for all
> supported platforms and programming languages
>
> Eventually, we should include some binary artifacts in our release
> votes, but we are pretty far away from suitable automation to make
> this possible.
>
> I don't know any easy solutions, but Apache Arrow has grown widely
> used enough that I think it's worth our taking the time to plan and
> execute some solutions to these problems, which I expect to pay
> dividends in our community's productivity over time.
>
> Thanks,
> Wes