Hi Fokko,

Thank you for pointing me to the ASF Release Policy. It was very
informative!

Based on the policies, we cannot automate signing in GitHub Actions because
Python wheels are not considered reproducible [1]. The wheels include
information about the current date and time, which prevents validation with
checksums. While there might be ways to create reproducible wheels [2], I
don’t think it’s necessary at this time.
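
To illustrate why the checksums differ: a wheel is a zip archive, and each
zip entry records a modification time. The sketch below is not the PyIceberg
build. It packs the same made-up file contents into an in-memory zip with
different timestamps and then with a pinned timestamp, and compares the
SHA-512 digests. The helper names, file names, and timestamps are
hypothetical.

```python
import hashlib
import io
import zipfile


def build_archive(files, timestamp):
    """Pack `files` (name -> bytes) into an in-memory zip.

    `timestamp` is the (year, month, day, hour, minute, second) tuple
    stored on every entry. Hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Sort names so entry order does not depend on dict ordering.
        for name in sorted(files):
            info = zipfile.ZipInfo(filename=name, date_time=timestamp)
            info.compress_type = zipfile.ZIP_DEFLATED
            zf.writestr(info, files[name])
    return buf.getvalue()


def sha512_hex(data):
    """Return the SHA-512 digest of `data` as a hex string."""
    return hashlib.sha512(data).hexdigest()


# Made-up package contents.
files = {"pkg/__init__.py": b"__version__ = '0.0.0'\n"}

# Same contents, different build times: the digests differ.
monday = build_archive(files, (2024, 12, 2, 20, 38, 0))
tuesday = build_archive(files, (2024, 12, 3, 1, 31, 0))
print(sha512_hex(monday) == sha512_hex(tuesday))

# Same contents, pinned timestamp: the digests match.
pinned = (1980, 1, 1, 0, 0, 0)
print(sha512_hex(build_archive(files, pinned)) == sha512_hex(build_archive(files, pinned)))
```

Pinning the timestamp is the general idea behind reproducible-build tooling,
but a real wheel build has other sources of variation that this sketch does
not cover.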

The policy around trusted hardware is interesting. This means the release
manager must download the artifacts and verify them locally before
publishing.
> hardware owned and controlled by the committer. That means hardware the
> committer has physical possession and control of and exclusively full
> administrative/superuser access to.

Given the above, the steps for "Create a Release Candidate" should be as
follows:
* Create and push a tag for the Release Candidate (e.g., `0.8.1rc1`)
* The tag push triggers a GitHub Action workflow that builds artifacts for
both SVN and PyPI
* The Release Manager downloads the SVN and PyPI artifacts locally
* The Release Manager generates SHA-512 checksums and GPG signatures
locally for the SVN artifacts and uploads them to SVN.
* The Release Manager uploads the PyPI artifacts to PyPI
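
The local checksum and signing step could be sketched roughly as follows.
This is only an illustration, not the script from PR #1391: the helper
names, the `.sha512` file layout (`<digest>  <filename>`), and the key ID
passed in are assumptions. The `gpg` options used (`--armor`,
`--local-user`, `--output`, `--detach-sign`) are standard GnuPG options.

```python
import hashlib
import subprocess
from pathlib import Path


def write_sha512(artifact: Path) -> Path:
    """Write `<artifact>.sha512` next to the artifact and return its path.

    The "<digest>  <filename>" layout is an assumption for this sketch.
    """
    digest = hashlib.sha512(artifact.read_bytes()).hexdigest()
    checksum_file = artifact.with_name(artifact.name + ".sha512")
    checksum_file.write_text(f"{digest}  {artifact.name}\n")
    return checksum_file


def gpg_sign_command(artifact: Path, key_id: str) -> list[str]:
    """Build the gpg command for an ASCII-armored detached signature."""
    return [
        "gpg",
        "--armor",
        "--local-user", key_id,
        "--output", f"{artifact}.asc",
        "--detach-sign", str(artifact),
    ]


def sign(artifact: Path, key_id: str) -> None:
    """Run gpg locally; requires gpg and the release manager's private key."""
    subprocess.run(gpg_sign_command(artifact, key_id), check=True)
```

Both helpers are meant to run on the Release Manager's own machine, after
the artifacts have been downloaded from the GitHub Actions run.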

The GitHub Action will no longer create checksum or signature files.
Instead, the Release Manager will need to set up the required SVN and GPG
infrastructure locally.

I will change the release process accordingly in PR #1391
<https://github.com/apache/iceberg-python/pull/1391> and update the "how to
release" documentation as well.

Regarding the Nightly Builds, we can use a GitHub Action to build and push
artifacts to PyPI. The "build" step is already in place; we just need a
PyPI API key to push the artifacts. I can generate a new PyPI token using
my own PyPI account, or we could request one from ASF Infra.
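
As a rough sketch of the push step, assuming `twine` is used for the upload
and the token is exposed to the workflow as an environment variable: the
helper names and the `PYPI_API_TOKEN` variable name are hypothetical, and
the default URL below points at TestPyPI rather than PyPI. `TWINE_USERNAME`,
`TWINE_PASSWORD`, `--non-interactive`, and `--repository-url` are standard
twine settings.

```python
import os
from pathlib import Path

# TestPyPI upload endpoint; swap in another index URL as needed.
TEST_PYPI_URL = "https://test.pypi.org/legacy/"


def twine_upload_command(dist_dir, repository_url=TEST_PYPI_URL):
    """Build a `twine upload` command for the wheels and sdists in dist_dir."""
    artifacts = sorted(
        str(path)
        for path in Path(dist_dir).iterdir()
        if path.name.endswith((".whl", ".tar.gz"))
    )
    if not artifacts:
        raise FileNotFoundError(f"no wheels or sdists found in {dist_dir}")
    return [
        "twine", "upload", "--non-interactive",
        "--repository-url", repository_url,
        *artifacts,
    ]


def twine_env(token):
    """Return a copy of the environment with API-token credentials for twine."""
    env = dict(os.environ)
    env["TWINE_USERNAME"] = "__token__"  # literal user name for API tokens
    env["TWINE_PASSWORD"] = token
    return env


# Possible usage (PYPI_API_TOKEN is a hypothetical secret name):
#   import subprocess
#   subprocess.run(
#       twine_upload_command("dist"),
#       env=twine_env(os.environ["PYPI_API_TOKEN"]),
#       check=True,
#   )
```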

Thanks again for your help!

Best,
Kevin Liu

[1] https://reproducible-builds.org/
[2] https://github.com/wimglenn/setuptools-reproducible

On Tue, Dec 3, 2024 at 1:31 AM Fokko Driesprong <fo...@apache.org> wrote:

> Hey Kevin,
>
> First of all, thanks for working on the releases, that's always much
> appreciated.
>
> Regarding the changes to the release process, I'm all for automating as
> much as possible, but I have some concerns. I also think it is important to
> treat nightly builds separately from the release process in general.
>
> Releases
>
> Concerning the releases, I think the official ASF release policy
> <https://www.apache.org/legal/release-policy.html#artifacts> is a very
> good read. While reading up on this topic, I noticed that it is allowed to
> have automated signing of the release
> <https://infra.apache.org/release-signing.html#automated-release-signing>,
> but this comes with some prerequisites, such as having reproducible builds.
> This is not the case for PyIceberg today, i.e., if you build wheels twice,
> they have different checksums. Also, a manual validation step is still part
> of the process, where all artifacts are produced on trusted hardware
> <https://www.apache.org/legal/release-policy.html#owned-controlled-hardware>
> before publication.
>
> I would lean much more towards the way Iceberg-Go has solved this
> <https://github.com/apache/iceberg-go/blob/main/dev/release/release_rc.sh>.
> It creates a tag locally and pushes it to the repository; the tag triggers
> a GitHub Actions workflow that generates the required artifacts and the
> convenience artifacts. Those are downloaded to the local machine, a
> signature and checksum are added, and the results are pushed back to
> GitHub Actions and SVN.
>
> Nightly Builds
>
> As also stated in the release policy, we can provide nightly builds for
> the development community, but as a project, we should direct outsiders
> toward the official releases. Since a nightly build is not an official
> release
> <https://repository.apache.org/content/groups/snapshots/org/apache/iceberg/>,
> it doesn't have to go through the whole process including signatures and
> checksums, as long as it is sufficiently hidden from the end-users and
> intended for developers only. We could, for example, push a nightly build
> to test.pypi.org <https://test.pypi.org/project/pyiceberg/>.
>
> Thanks again for working on this. Hope this helps, and let me know what
> you think!
>
> Kind regards,
> Fokko
>
>
> On Mon, Dec 2, 2024 at 8:38 PM Kevin Liu <kevinjq...@apache.org> wrote:
>
>> Hi everyone,
>>
>> As the release manager for PyIceberg 0.8.0 and the upcoming 0.8.1
>> release, I’ve taken some time to reflect on ways we could improve the
>> release process. I drew inspiration from the iceberg-go release process and
>> documented my notes here
>> <https://github.com/apache/iceberg-python/issues/1306>. I’ve also
>> updated the release instructions here
>> <https://py.iceberg.apache.org/how-to-release/>.
>>
>> Currently, the release process is manual and prone to errors. My goal is
>> to automate it as much as possible, ideally transforming it into a
>> single-click process.
>>
>> I’d like to gather your thoughts on two key ideas:
>>
>>    1. Automating the release process to reduce manual steps and errors.
>>    2. Introducing nightly builds to PyPI once automation is in place (issue
>>    #872 <https://github.com/apache/iceberg-python/issues/872>).
>>
>> The PyIceberg release process can be summarized in these steps:
>>
>>    - Create a Release Candidate (RC)
>>    - Vote on the devlist
>>    - Promote the RC to a Final Release
>>
>> I believe the "Create a Release Candidate" step can benefit the most
>> from automation. Here’s a breakdown of the current steps:
>>
>>    - Create a tag for the Release Candidate (e.g., `0.8.1rc1`).
>>    - Generate artifacts (currently done using GitHub Actions).
>>    - Generate SHA-512 checksums and GPG signatures, then upload the
>>    artifacts to SVN.
>>    - Upload the artifacts to PyPI.
>>
>> To automate these steps via GitHub Actions, we’d need to address the
>> following:
>>
>>    - *GPG Signing*: GitHub Actions require a `GPG_PRIVATE_KEY` secret.
>>    I’ve tested this with my own key, but it would be better to create a new
>>    key (possibly owned by ASF) for signing files.
>>    - *SVN Uploads*: Uploading artifacts to SVN requires credentials. I
>>    haven’t tested this step yet, but we should aim to use credentials
>>    provided by ASF Infra instead of personal ones.
>>    - *PyPI Uploads*: Similarly, uploading to PyPI requires an API token,
>>    which should ideally be provided by ASF Infra.
>>
>> I’ve begun automating the artifact generation process (PR #1391
>> <https://github.com/apache/iceberg-python/pull/1391>). However, the
>> release manager currently still needs to manually download and upload
>> artifacts to both SVN and PyPI.
>>
>> Once the "Create a Release Candidate" step is automated, we can create a
>> GitHub Action to manually build and upload a nightly version to PyPI.
>>
>>
>> *Is this the direction we want to take for the release process? If so,
>> what’s the best way to coordinate with ASF Infra to create the necessary
>> credentials?*
>>
>> I’d love to hear your thoughts and any additional suggestions.
>> Best,
>> Kevin Liu
>>
>>
