On Thu, Jan 21, 2021 at 8:11 AM Sutou Kouhei <k...@clear-code.com> wrote:
>
> Hi,
>
> I'm not sure how much this change will improve our release
> process but I'm OK with this try.
>
> Here are technical blockers for this try:
>
>   * Java packaging: WIP: https://github.com/apache/arrow/pull/9155
>     * It takes 10m+.
>     * It may fail because a release manager needs to prepare a
>       local environment to do this.
Preferably we should dockerize this step as well.
>
>   * GLib source archive preparation:
>     https://github.com/apache/arrow/blob/master/dev/release/source/build.sh
>     * It takes 1m+.
>     * It is unlikely to fail because most tasks are done in Docker,
>       but that means a release manager needs to prepare Docker.
I had multiple failures during this step before containerization;
since then it has never failed.
>
> There are still some small tasks(*) to build source archive
> but they aren't blockers.
>
> (*) https://github.com/apache/arrow/blob/master/dev/release/02-source.sh#L84-L97
>
> We can avoid GLib source archive preparation by dropping
> support for GNU Autotools. They are used on CentOS 7 and
> Ubuntu 16.04. We can use alternative build system (Meson) on
> CentOS 7. We'll drop support for Ubuntu 16.04 soon. (Ubuntu
> 16.04's EOL is 2021-04.)
>
>
> > I'll start a new rc, it'll be done in 12 hours
>
> From my past experience as release manager, here are the
> time-consuming tasks:
>
>   1. Fixing nightly builds
>      * Generally, we always have failure builds.
>      * I needed 2~3 days for this.
>      * I'm still working on this even when I'm not a release manager.
>
>   2. Build source including Java packages preparation
>      * I always failed this with some problems and retried
>        multiple times.
I experienced the same and each iteration takes 10+ minutes.
>      * For example: https://issues.apache.org/jira/browse/ARROW-5764
>        [Java] Failed to build document with OpenJDK 11
>        (This is not fixed yet.)
>      * I can't go to the next step while this task isn't completed.
>
>   3. Building binary packages
>      * I just need to wait 1~2 hours.
It usually took around 3 hours. AppVeyor was the slowest component
here because it offered no parallelization, so we had to wait for 4
wheel builds, each taking around 50 minutes.
This is the first release where we build the Windows wheels on GitHub
Actions; the overall time to build the binaries is now just a bit
above one hour.
>        * We'll be able to speed up this by using cache such as
>          ccache for C++ in Crossbow tasks: 1~2 hours -> 10~20 minutes
We always create new branches and caches are scoped per branch, so it
would require a tricky workaround to use the GitHub Actions cache
action; see the cache scopes at
https://github.com/actions/cache#cache-scopes
>      * Generally, this isn't failed because nightly builds are fixed.
>
>   4. Downloading built binary packages and uploading binary packages
>      * It takes 1~2 hours because we have many files.
Downloading takes 10-15 minutes on a 500 Mbit/s network with a single thread.
I tried to parallelize it before, but quickly hit the GitHub API abuse
rate limit; see
https://docs.github.com/en/rest/overview/resources-in-the-rest-api#abuse-rate-limits
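
One possible middle ground between a single thread and unrestricted
parallelism is a small worker pool whose request starts are spaced out.
The sketch below only illustrates the idea: `fetch` is a hypothetical
callable standing in for the real download step, the worker count and
interval are arbitrary, and whether any particular setting stays under
GitHub's limits is an untested assumption.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class Throttle:
    """Allow at most one request start per min_interval seconds."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_start = 0.0

    def wait(self):
        # Reserve the next start slot under the lock, then sleep outside it
        # so other threads can reserve their own slots in the meantime.
        with self._lock:
            now = self._clock()
            start = max(now, self._next_start)
            self._next_start = start + self._min_interval
        if start > now:
            self._sleep(start - now)


def download_all(urls, fetch, max_workers=4, min_interval=0.5):
    """Run fetch(url) for every URL with bounded, throttled concurrency.

    fetch is a hypothetical callable; results come back in input order.
    """
    throttle = Throttle(min_interval)

    def task(url):
        throttle.wait()
        return fetch(url)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(task, urls))
```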

Uploading binaries is the slowest part of the process: it takes around
2 hours even though we upload the binaries concurrently. Bintray also
tends to reject requests, so I need to restart the upload script
multiple times before completion. Occasionally I switch to a cellular
network, which makes the upload slower but more stable.
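
The manual restarts described above could in principle be replaced by
automatic retries with backoff. This is a minimal sketch, not the actual
Arrow upload script: `upload_one` is a hypothetical callable that is
assumed to raise an exception when the server rejects a request, and the
attempt count and delays are arbitrary.

```python
import random
import time


def upload_with_retry(upload_one, path, max_attempts=5, base_delay=1.0,
                      sleep=time.sleep):
    """Call upload_one(path), retrying with exponential backoff and jitter.

    upload_one is a hypothetical stand-in for the real upload step; it is
    assumed to raise an exception on a rejected request.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return upload_one(path)
        except Exception:
            if attempt == max_attempts:
                raise  # give up after the last attempt
            # Wait base_delay * 2^(attempt - 1) seconds, plus up to 10% jitter.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Retrying per file rather than restarting the whole script means files
that already uploaded are not sent again.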
>
>   5. Verifying RC before starting vote
>      * I can start source verification while building binary packages.
>      * It takes 1~2 hours.
>      * Generally, I find some problems and fix them with the first RC.
>        * Most problems are caused by outdated verification script.
>        * It takes +0.5-1 hour per problem.
>        * I'm still working on this even when I'm not a release manager.
This caused the current release to take more time.
>
> This proposal will defer costs of 3., 4. and part of 5.
> 1. still exists because we can't keep green nightly builds
> for now.
>
>
> > It also solves questions such as "Why should the Rust
> > release be blocked just because we're having a problem
> > building Python wheels on macOS?"
>
> It solves the question only when the problem is only related
> to packaging. If we have a non-packaging problem such as
> integration test failure, our release will be blocked.
>
>
>
> I still think that implementing continuous (nightly) release
> verification is needed and maintained. If we keep green
> release verification, we'll always be able to cut a RC
> without problems.
I would prefer this approach. If we could simulate the release
process and its verification on a nightly basis, then we shouldn't have
any major surprises.
>
>
> Thanks,
> --
> kou
>
> In <CAOCv4hg_usTK-4WvNDyRtTEUW6BiS7wtN3s=HOVa=p4cfgb...@mail.gmail.com>
>   "[Proposal] Modify release process to vote only on source release" on Tue, 
> 19 Jan 2021 15:16:20 -0800,
>   Neal Richardson <neal.p.richard...@gmail.com> wrote:
>
> > Hi all,
> > Over the past year, there's been a lot of discussion around the challenges
> > we face as a project in doing releases. Because they are costly to do, we
> > don't do them often; because we don't do them often, they become even
> > costlier.
> >
> > There are only a small number of people (PMC members with GPG keys
> > registered with ASF) who could possibly be release manager, and because of
> > the amount of time required (I saw Krisztián say on the 3.0 release thread
> > something like "I'll start a new rc, it'll be done in 12 hours"), even fewer
> > people could be expected to take on the burden. Indeed, this is Krisztián's
> > 10th release in a row as release manager, and over the course of the
> > project, 2/3 of all release candidates have been made by just 2 people.
> >
> > I'd like to propose a change to our release procedure: instead of having
> > the release candidate vote include Python wheels, Linux system packages, or
> > any other binary packages, we should only vote on the source release.
> > Binary artifacts would be produced as post-release tasks, using the
> > official source release.
> >
> > This would greatly reduce the time and effort it takes to produce a release
> > candidate--tar, sign, and upload, that's it--and it would remove a bunch of
> > points of failure from the release-candidate making process (timeouts, CI
> > flakiness, etc.). It would also mean fewer release-blocking issues--we
> > still have to fix the packaging builds, but doing so can happen in parallel
> > with the verification process. If we found problems in the packaging
> > scripts, fixes could either be applied as patch steps to the binary
> > artifact build scripts, or if fixes can be produced quickly, we collect
> > them and cut another (cheap) release candidate. Right now, our only option
> > is the latter, which makes for a slow, stressful release process where
> > there are so many places where a simple issue can block the whole release
> > or set us back an additional week (a full day to produce a release
> > candidate plus another three to vote).
> >
> > If we went this direction, we could still choose to vote separately on
> > binary packages like wheels, though I'm not sure that's worth the effort.
> > Many of the packages that people use (conda, homebrew, CRAN, etc.) are
> > already "unofficial" releases because they're packaged by someone else, and
> > I don't think the distinction is meaningful to our users.
> >
> > To be clear, this doesn't reduce the general maintenance burden of the
> > project. We still have to monitor nightly builds, fix packaging scripts
> > that break, and deal with CI service interruptions. This change would just
> > reduce the burden on the release manager and allow us to spread more
> > broadly the costs of packaging and releasing. It also solves questions such
> > as "Why should the Rust release be blocked just because we're having a
> > problem building Python wheels on macOS?"
> >
> > There are also other things we could do that would, on a technical level,
> > improve our ability to make releases more efficiently. Andy Grove's change
> > in the use of maven in the release process will help, as would a number of
> > CI/CD improvements. I view these as complementary to this proposal, which
> > is a governance question with technical/logistical implications.
> >
> > Thoughts?
> >
> > Neal
