Hi Everyone,

I propose that we release the following RC as the official PyIceberg 0.5.0 release.

A summary of what's included in 0.5.0:

- Add gzip metadata support <https://github.com/apache/iceberg/pull/7984>
- PyArrow HDFS support <https://github.com/apache/iceberg/pull/7997>
- Support serverless environments (AWS Lambda) <https://github.com/apache/iceberg/pull/8061>
- Many fixes around Avro performance (PRs 1 <https://github.com/apache/iceberg/pull/8074>, 2 <https://github.com/apache/iceberg/pull/8075>, 3 <https://github.com/apache/iceberg/pull/8082>, 4 <https://github.com/apache/iceberg/pull/8084>)
- Remove the upper bound of the PyParsing dependency <https://github.com/apache/iceberg/pull/8116> (blocking a PR in Airflow <https://github.com/apache/airflow/pull/32786>)
- Move the reading of Avro to Cython <https://github.com/apache/iceberg/pull/8134> (10x speed improvement(!))
- Support for the SQLCatalog <https://github.com/apache/iceberg/pull/7921> (JDBC in Java)
- Fix support for UUID columns <https://github.com/apache/iceberg/pull/8267>
- Support for adding columns <https://github.com/apache/iceberg/pull/8174>
- Optimize concurrency <https://github.com/apache/iceberg/pull/8104> (follow-up to the serverless environments support)
- Bump Pydantic to v2 <https://github.com/apache/iceberg/pull/7782> (improved performance of the JSON (de)serialization)
- A lot of bugfixes!

The commit ID is 3323281045a72f1156d58c261067469e383fb26d

* This corresponds to the tag: pyiceberg-0.5.0rc2 (92600935834bdf77ba37ac361338712713549a77)
* https://github.com/apache/iceberg/releases/tag/pyiceberg-0.5.0rc2
* https://github.com/apache/iceberg/tree/3323281045a72f1156d58c261067469e383fb26d

The release tarball, signature, and checksums are here:

* https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.5.0rc2/

You can find the KEYS file here:

* https://dist.apache.org/repos/dist/dev/iceberg/KEYS

Convenience binary artifacts are staged on PyPI:

https://pypi.org/project/pyiceberg/0.5.0rc2/

And can be installed using:

pip3 install pyiceberg==0.5.0rc2

Since a lot has changed due to the release of the wheels (binary Python libraries), I've included the following steps to verify the release:

curl https://dist.apache.org/repos/dist/dev/iceberg/KEYS -o KEYS
gpg --import KEYS

svn checkout https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.5.0rc2/ /tmp/pyiceberg/

for name in $(ls /tmp/pyiceberg/pyiceberg-*.whl /tmp/pyiceberg/pyiceberg-*.tar.gz)
do
    gpg --verify "${name}.asc" "${name}"
done

cd /tmp/pyiceberg/
for name in $(ls /tmp/pyiceberg/pyiceberg-*.whl.asc.sha512 /tmp/pyiceberg/pyiceberg-*.tar.gz.asc.sha512)
do
    shasum -a 512 --check "${name}"
done

tar xzf pyiceberg-0.5.0.tar.gz
cd pyiceberg-0.5.0
./dev/check-license

Please download, verify, and test.

Please vote in the next 72 hours.

[ ] +1 Release this as PyIceberg 0.5.0
[ ] +0
[ ] -1 Do not release this because...

Please consider this my +1. I've checked against the docker-spark-iceberg <https://github.com/tabular-io/docker-spark-iceberg/pull/92> notebook, and ran some additional checks.

Kind regards,
Fokko Driesprong