Hi,

I'm not sure how much this change will improve our release
process, but I'm OK with trying it.

Here are the technical blockers for trying it:

  * Java packaging: WIP: https://github.com/apache/arrow/pull/9155
    * It takes 10+ minutes.
    * It may fail because a release manager needs to prepare
      a local environment to do this.

  * GLib source archive preparation:
    https://github.com/apache/arrow/blob/master/dev/release/source/build.sh
    * It takes 1+ minute.
    * It is unlikely to fail because most tasks are done in Docker.
      But it means that a release manager needs to set up Docker.

There are still some small tasks (*) needed to build the source
archive, but they aren't blockers.

(*) https://github.com/apache/arrow/blob/master/dev/release/02-source.sh#L84-L97

We can avoid GLib source archive preparation by dropping
support for GNU Autotools. It is used on CentOS 7 and
Ubuntu 16.04. We can use an alternative build system (Meson)
on CentOS 7. We'll drop support for Ubuntu 16.04 soon. (Ubuntu
16.04's EOL is 2021-04.)


> I'll start a new rc, it'll be done in 12 hours

From my past experience as a release manager, here are the
time-consuming tasks:

  1. Fixing nightly builds
     * Generally, we always have some failing builds.
     * I needed 2~3 days for this.
     * I'm still working on this even when I'm not a release manager.

  2. Building the source archive, including Java package preparation
     * This always failed for me with some problem, and I had to
       retry multiple times.
     * For example: https://issues.apache.org/jira/browse/ARROW-5764
       [Java] Failed to build document with OpenJDK 11
       (This is not fixed yet.)
     * I can't go to the next step until this task is completed.

  3. Building binary packages
     * I just need to wait 1~2 hours.
       * We'll be able to speed this up by using a cache such as
         ccache for C++ in Crossbow tasks: 1~2 hours -> 10~20 minutes
     * Generally, this doesn't fail because the nightly builds have
       already been fixed.

  4. Downloading built binary packages and uploading binary packages
     * It takes 1~2 hours because we have many files.

  5. Verifying RC before starting vote
     * I can start source verification while building binary packages.
     * It takes 1~2 hours.
     * Generally, I find some problems and fix them with the first RC.
       * Most problems are caused by an outdated verification script.
       * Each problem takes an additional 0.5~1 hour.
       * I'm still working on this even when I'm not a release manager.

This proposal will defer the costs of 3., 4. and part of 5.
The cost of 1. still remains because we can't keep nightly
builds green for now.
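
As a rough illustration of the arithmetic behind the list above,
here is a minimal Python sketch that adds up the estimates for the
tasks this proposal fully defers (3. and 4.). The minute values are
taken from the ranges above, the 10~20 minutes figure is only the
expected ccache speed-up and not a measurement, and all names in the
code are hypothetical. Task 5. is left out because only part of it
is deferred.

```python
# Estimates from the task list above, as (low, high) in minutes.
BUILD_BINARIES = (60, 120)         # 3. Building binary packages
BUILD_BINARIES_CCACHE = (10, 20)   # 3. with ccache (expected, not measured)
TRANSFER_BINARIES = (60, 120)      # 4. Downloading and uploading packages


def add_ranges(*ranges):
    """Add (low, high) ranges element-wise."""
    low = sum(r[0] for r in ranges)
    high = sum(r[1] for r in ranges)
    return (low, high)


def fully_deferred(with_ccache=False):
    """Release manager time for tasks 3. and 4., which move after the vote."""
    build = BUILD_BINARIES_CCACHE if with_ccache else BUILD_BINARIES
    return add_ranges(build, TRANSFER_BINARIES)


if __name__ == "__main__":
    print("deferred (minutes):", fully_deferred())
    print("deferred with ccache (minutes):", fully_deferred(with_ccache=True))
```

This only covers waiting time per RC. The 2~3 days for 1. and the
retries for 2. are not affected by the proposal, so they are not
part of this sum.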


> It also solves questions such as "Why should the Rust
> release be blocked just because we're having a problem
> building Python wheels on macOS?"

It answers the question only when the problem is purely a
packaging problem. If we have a non-packaging problem such as
an integration test failure, our release will still be blocked.



I still think that we need to implement and maintain continuous
(nightly) release verification. If we keep release verification
green, we'll always be able to cut an RC without problems.


Thanks,
--
kou

In <CAOCv4hg_usTK-4WvNDyRtTEUW6BiS7wtN3s=HOVa=p4cfgb...@mail.gmail.com>
  "[Proposal] Modify release process to vote only on source release" on Tue, 19 Jan 2021 15:16:20 -0800,
  Neal Richardson <neal.p.richard...@gmail.com> wrote:

> Hi all,
> Over the past year, there's been a lot of discussion around the challenges
> we face as a project in doing releases. Because they are costly to do, we
> don't do them often; because we don't do them often, they become even
> costlier.
> 
> There are only a small number of people (PMC members with GPG keys
> registered with ASF) who could possibly be release manager, and because of
> the amount of time required (I saw Krisztián say on the 3.0 release thread
> something like "I'll start a new rc, it'll be done in 12 hours), even fewer
> people could be expected to take on the burden. Indeed, this is Krisztián's
> 10th release in a row as release manager, and over the course of the
> project, 2/3 of all release candidates have been made by just 2 people.
> 
> I'd like to propose a change to our release procedure: instead of having
> the release candidate vote include Python wheels, Linux system packages, or
> any other binary packages, we should only vote on the source release.
> Binary artifacts would be produced as post-release tasks, using the
> official source release.
> 
> This would greatly reduce the time and effort it takes to produce a release
> candidate--tar, sign, and upload, that's it--and it would remove a bunch of
> points of failure from the release-candidate making process (timeouts, CI
> flakiness, etc.). It would also mean fewer release-blocking issues--we
> still have to fix the packaging builds, but doing so can happen in parallel
> with the verification process. If we found problems in the packaging
> scripts, fixes could either be applied as patch steps to the binary
> artifact build scripts, or if fixes can be produced quickly, we collect
> them and cut another (cheap) release candidate. Right now, our only option
> is the latter, which makes for a slow, stressful release process where
> there are so many places where a simple issue can block the whole release
> or set us back an additional week (a full day to produce a release
> candidate plus another three to vote).
> 
> If we went this direction, we could still choose to vote separately on
> binary packages like wheels, though I'm not sure that's worth the effort.
> Many of the packages that people use (conda, homebrew, CRAN, etc.) are
> already "unofficial" releases because they're packaged by someone else, and
> I don't think the distinction is meaningful to our users.
> 
> To be clear, this doesn't reduce the general maintenance burden of the
> project. We still have to monitor nightly builds, fix packaging scripts
> that break, and deal with CI service interruptions. This change would just
> reduce the burden on the release manager and allow us to spread more
> broadly the costs of packaging and releasing. It also solves questions such
> as "Why should the Rust release be blocked just because we're having a
> problem building Python wheels on macOS?"
> 
> There are also other things we could do that would, on a technical level,
> improve our ability to make releases more efficiently. Andy Grove's change
> in the use of maven in the release process will help, as would a number of
> CI/CD improvements. I view these as complementary to this proposal, which
> is a governance question with technical/logistical implications.
> 
> Thoughts?
> 
> Neal
