Re: Confronting Arrow packaging problems

Phillip Cloud Mon, 26 Mar 2018 09:02:27 -0700

Responses inline. This kind of information is extremely helpful and
informative.


On Mon, Mar 26, 2018 at 11:26 AM Antoine Pitrou <anto...@python.org> wrote:

>
> Hi,
>
> As someone who started contributing recently, I'd like to raise a few
> points.  I hope this post doesn't come across as rambling or clueless,
> otherwise feel free to ignore / point it out :-)
>
>
> What does a release require?
> ============================
>
> I didn't find any official documentation answering that question.
>
https://github.com/apache/arrow/blob/master/dev/release/RELEASE_MANAGEMENT.md
is the one I used for the most recent release.

>
> Right now, the in-line CI in the arrow repository ensures we have the
> following:
> - the source base builds fine in *some* configurations on each of the three
>   major platforms (Linux, macOS, Windows)
> - the various test suites run fine on each of those platforms
>
> Is it a requirement that binary packages can be produced reliably for
> a number of platforms, and if so, which ones?  Is it a requirement that
> binary packages are available from day one when a release is done, or
> is that a best effort thing depending on the availability of specific
> platform maintainers?  It would be useful to spell that out somewhere.
>

The release management doc doesn't spell out the specific nitty-gritty of
what the artifacts should and shouldn't contain, though it does contain
some information about how to produce some of the artifacts. It's critical
that we spell this out somewhere.


>
>
> Who is responsible for producing packages?
> ==========================================
>
> Right now it seems packages are all produced out of a single repository
> "arrow-dist".  That repository handles production of binary artifacts:
> Python wheels, Ubuntu / CentOS / Debian packages...
>
> It's not obvious if specific people are responsible for each of the
> package production chains.  It's common in open source projects to have
> dedicated persons (or teams) responsible for each platform target.
> This ensures that 1) the packages are produced by motivated people
> who are familiar enough with their platforms of interest 2) producing
> packages does not otherwise drain the stamina of the development team.
>

Huge +1 on moving some of the packaging outside the scope of responsibility
of arrow dev. Specifically, I don't think we should be responsible for
anything except wheels and conda packages.

One question I have here is: are the separate package type scripts/software
maintained in different repositories?

Also +1 on having a person responsible for each platform. I wonder if having
a person responsible for a specific kind of artifact might spread the
workload more evenly, since there's likely a shortage of Windows expertise.


>
> CI strategy
> ===========
>
> We have two conflicting requirements:
> 1) Test as much as possible as part of continuous integration (including,
>    possibly, the production of viable binary packages)
> 2) Keep CI times reasonable to avoid grinding.  Some significant work
>    was done recently to cut down our build times on Travis-CI
>    and AppVeyor, often by half (ARROW-2071, ARROW-2083, ARROW-2231).
>
> To give a point of comparison, CPython has a two-pronged approach:
>
> 1) in-line CI using Travis-CI and AppVeyor, with simple build matrices
>    (1 build on AppVeyor, 2 required + 2 optional on Travis-CI).  In-line
>    CI must validate for a PR to be merged.
> 2) out-of-line CI using a farm of buildbots:
>    http://buildbot.python.org/all/#/grid?branch=master


Buildbot looks *a lot* better than the last time I looked at it :)


>
>
> Each buildbot has a maintainer, interested in keeping that specific platform
> and configuration running.  Some buildbots are marked stable and strongly
> recommended to be green at all times (and especially when releasing).  Some
> buildbots on the other hand are marked unstable and represent less mainstream
> configurations which are just "nice to fix".
>
> The take-aways here are:
> * Mainline development isn't throttled by the production of binary artifacts
>   or testing on a myriad of (possibly slow or busy) CI platforms.
> * Each tested configuration has a maintainer willing to identify and diagnose
>   problems (either propose a solution themselves or notify the developer
>   responsible for a regression).
> * Some things are release blockers (the "stable" platforms), some are not
>   and just nice to have.
>

IMO the "stable" platforms should be conda packages for arrow/pyarrow and
pip wheels. We should discuss that more.
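
To make the stable / unstable split concrete, here is a minimal sketch in
Python. Everything in it is hypothetical: the configuration names, the
maintainer names, and the release_blockers helper are made up for
illustration and do not correspond to anything in the arrow or arrow-dist
repositories. The idea is only that a failing "stable" configuration blocks
a release, while a failing "unstable" one is merely nice to fix.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildConfig:
    name: str        # hypothetical label for a packaging/CI configuration
    stable: bool     # True: must be green to release; False: "nice to fix"
    maintainer: str  # person responsible for keeping it running


def release_blockers(configs, failing):
    """Return the sorted names of failing configurations marked stable."""
    failing = set(failing)
    return sorted(c.name for c in configs if c.stable and c.name in failing)


# Illustrative data only; the split mirrors the suggestion above, not a decision.
CONFIGS = [
    BuildConfig("conda-linux", stable=True, maintainer="alice"),
    BuildConfig("wheel-manylinux1", stable=True, maintainer="bob"),
    BuildConfig("deb-ubuntu", stable=False, maintainer="carol"),
]

blockers = release_blockers(CONFIGS, failing=["deb-ubuntu", "wheel-manylinux1"])
print(blockers)
```

With this data, only the wheel build would hold up a release; the failing
Debian package build would be reported to its maintainer but would not block.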


>
> Two side notes:
> * CPython is a much simpler project than Arrow, since it's C99 with minimal
>   dependencies.
> * I wouldn't necessarily recommend buildbot as a CI platform.
>
>
> Build options
> =============
>
> It may be useful to look into reducing the number of build options, and/or
> standardize on supported settings, per platform.  For example, we should
> decide whether boost should be bundled or not, namespaced or not, on each
> platform.  People with specific development requirements can try to override
> that, but with no guarantee from us.
>

+1. Trying to satisfy everyone's downstream needs is an impossible task.
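
As a rough sketch of what "standardize on supported settings, per platform"
could look like, the Python snippet below keeps one table of supported
settings and reports any override that departs from it. The option names
(boost_bundled, boost_namespaced) and every value in the table are
placeholders I made up; nothing here reflects an actual decision or an
existing Arrow build flag.

```python
# Hypothetical per-platform supported settings; the values are placeholders.
SUPPORTED = {
    "linux": {"boost_bundled": True, "boost_namespaced": True},
    "macos": {"boost_bundled": True, "boost_namespaced": True},
    "windows": {"boost_bundled": False, "boost_namespaced": False},
}


def unsupported_overrides(platform, requested):
    """Return the requested options that differ from the supported settings.

    Options that are not in the table at all are also reported, since we
    make no guarantee about them either.
    """
    supported = SUPPORTED[platform]
    return {k: v for k, v in requested.items() if supported.get(k) != v}


overrides = unsupported_overrides("linux", {"boost_bundled": False})
print(overrides)
```

A non-empty result would not have to fail the build; it could simply print a
warning that the configuration is outside what we test and support.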


>
> For example, on the llvmlite project we decided early on that we would always
> link LLVM statically.  Third-party maintainers may decide to do things
> differently, but they would have to maintain their own build scripts or
> patches.
>
>
> Regards
>
> Antoine.
>
>
> Le 23/03/2018 à 17:58, Wes McKinney a écrit :
> > hi folks,
> >
> > So, I want to bring light to the problems we are having delivering
> > binary artifacts after Arrow releases.
> >
> > We have some amount of packaging automation implemented in
> > https://github.com/apache/arrow-dist using Travis CI and Appveyor to
> > upload packages to Bintray, a packaging hosting service.
> >
> > Unfortunately, we discovered a bunch of problems with these packaging
> > scripts after the release vote closed on Monday, and now 4 days later,
> > we still have been unable to post binaries to
> > https://pypi.python.org/pypi/pyarrow
> >
> > This is no one's fault, but it highlights structural problems with our
> > development process:
> >
> > * Why does producing packages after a release require error-prone manual
> > labor?
> >
> > * Why are we only finding out about packaging problems after a release
> > vote closes?
> >
> > * Why is setting up nightly binary builds a brittle and bespoke process?
> >
> > I hope all agree that:
> >
> > * Packaging should not be a hardship or require a lot of manual labor
> >
> > * Packaging problems on the master branch should be made known within
> > ~24 hours, so they can be remedied immediately
> >
> > * It should be straightforward to produce binary artifacts for all
> > supported platforms and programming languages
> >
> > Eventually, we should include some binary artifacts in our release
> > votes, but we are pretty far away from suitable automation to make
> > this possible.
> >
> > I don't know any easy solutions, but Apache Arrow has grown widely
> > used enough that I think it's worth our taking the time to plan and
> > execute some solutions to these problems, which I expect to pay
> > dividends in our community's productivity over time.
> >
> > Thanks,
> > Wes
> >
>
