Hi all,
We are now "just one click away" from being able to have automated
releases of Apache Polaris. That click is hitting the merge-button on
[1].
A big thank you to Pierre, for the hard work on this!
Setting up release automation in a more complex project is far from
easy, because there is just no way to fully test such release
processes. There are many little devils along the way, some that you
may already know and some that may suddenly say "hi" to you.
From a technical PoV, proposing a release and the actual release have
to be two separate steps for Apache projects. Details about the Apache
process in general can be found here [2], project-specific details
here [3].
TL;DR: "manual" releases require quite a few steps that all have to be
done 100% correctly. If any of them goes wrong, you start over.
The release automation boils this down to four GitHub workflows that
require only the strictly necessary user input: the version number,
for example "1.3".
1. workflow: Trigger the creation of the release branch
2. workflow: Update the release branch with the version and build the
final change-log for that version
3. workflow: Build the RC artifacts from the release branch and push
those to the various staging repositories
4. workflow: Finally, release the artifacts.
The automation is also aware of "retrying" an RC, so RC2, RC3, RC4,
however many might be needed. Just rerun steps 2 + 3.
The workflows also automatically increment the patch number if the
previous tag on that branch has already been released. Again, just
rerun steps 2 + 3.
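To illustrate the retry and patch-increment rules above, here is a
minimal Python sketch. It is not the logic of the actual workflows. The
tag scheme ("1.3.0-rc2" for candidates, "1.3.0" for a final release)
and the function name next_rc_tag are assumptions made up for this
example.

```python
import re

# Hypothetical tag scheme, for illustration only:
#   "<major>.<minor>.<patch>-rc<N>" for release candidates
#   "<major>.<minor>.<patch>"       for a final release
TAG = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-rc(\d+))?$")


def next_rc_tag(branch_version: str, tags: list[str]) -> str:
    """Return the next RC tag for a release branch such as "1.3"."""
    major, minor = (int(part) for part in branch_version.split("."))
    released = set()  # patch numbers that already have a final tag
    last_rc = {}      # patch number -> highest RC number seen so far

    for tag in tags:
        match = TAG.match(tag)
        if match is None:
            continue  # not a tag this sketch understands
        if (int(match[1]), int(match[2])) != (major, minor):
            continue  # belongs to a different release branch
        patch = int(match[3])
        if match[4] is None:
            released.add(patch)
        else:
            last_rc[patch] = max(last_rc.get(patch, 0), int(match[4]))

    # Move on to the next patch number only after a final release exists.
    patch = max(released) + 1 if released else 0
    # Otherwise retry the same patch version with the next RC number.
    rc = last_rc.get(patch, 0) + 1
    return f"{major}.{minor}.{patch}-rc{rc}"
```

Under these assumptions, a branch with only "1.3.0-rc1" yields
"1.3.0-rc2", and a branch where "1.3.0" has been released yields
"1.3.1-rc1".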
Every Apache Polaris committer will be able to run those workflows.
The technical release candidate verification is also a quite manual
and time-consuming effort. Checking the correctness of all the GPG
signatures and checksums, looking for the mandatory files, etc. takes
quite some time if you don't have any automation around this. This is
where [4] comes into play. That script, fed with the Git commit and
the version and RC number, does (hopefully all) the technical
verification, including a verification that every binary built locally
from the Git commit matches exactly ("bit by bit") the artifacts
proposed in the RC ("reproducible builds"). It is intentionally a bash
script so everybody can look into it and follow what it's doing - and
the documentation intentionally says to always download it from the
Polaris GitHub repository, so you always use the "right" version.
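For readers who want a feel for two of those checks, here is a rough
Python sketch of a checksum comparison and a "bit by bit" comparison.
This is not the script from [4], which is bash and does much more (GPG
signatures, mandatory files). The use of SHA-512, the checksum file
layout, and all function names are assumptions for illustration.

```python
import filecmp
import hashlib
from pathlib import Path


def sha512_hex(path: Path) -> str:
    """Compute the SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as stream:
        for chunk in iter(lambda: stream.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact: Path, checksum_file: Path) -> bool:
    """Compare an artifact against its published checksum.

    Assumption: the checksum file starts with the hex digest, optionally
    followed by the file name (the layout written by `sha512sum`).
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_hex(artifact) == expected


def identical(local_build: Path, rc_artifact: Path) -> bool:
    """Check that a locally built file matches the RC artifact exactly."""
    # shallow=False forces a comparison of the file contents instead of
    # trusting size and modification time.
    return filecmp.cmp(local_build, rc_artifact, shallow=False)
```

A reproducible build passes only if identical() holds for every
artifact, which is why the comparison looks at the contents and not at
file metadata.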
However, there are two (known) build parts that are not (necessarily)
reproducible [5], but we expect to have those resolved soon(-ish).
What's next?
We also want to release the Polaris CLI and the Python SDK [6]. This
is going to be integrated into the automated release workflows and
also into the verification script. Thanks to Artur and Honah for
helping solve these essential pieces of the whole release puzzle!
Then there will be Apache Trusted Releases (ATR), which is a "service
for verifying and distributing Apache releases securely". _Securely_
is the last but IMO most important keyword there. Supply chain attacks
have become a serious risk and that just has to be tackled.
The change to use ATR will (very likely) not change the work of the
release manager in any way. It will rather be "just" a change to the
workflow implementation, but not to its user interface.
And there are (all the) Polaris tools. As every tool is different and
has its own special needs, we have to adapt the automation work for
the tools to be released.
There are likely more improvements coming to the whole release
automation topic, but this is (I think) pretty much it.
Robert
[1] https://github.com/apache/polaris/pull/2383
[2] https://infra.apache.org/release-publishing.html
[3] https://polaris.apache.org/community/release-guide/
[4] https://github.com/apache/polaris/pull/2824
[5] https://github.com/apache/polaris/issues/2204
[6] https://github.com/apache/polaris/pull/3036