Hi Everyone,

I propose that we release the following RC as the official PyIceberg 0.8.0
release.

The commit ID is 3ccdc44735d70bd3ef6ed18b60b3eba43c4b3b44
<https://github.com/apache/iceberg-python/commit/3ccdc44735d70bd3ef6ed18b60b3eba43c4b3b44>

   - This corresponds to the tag: pyiceberg-0.8.0rc2
     (4a7abd0478996547ee68a5ee1847130bc0a45c10)
   - https://github.com/apache/iceberg-python/releases/tag/pyiceberg-0.8.0rc2
   - https://github.com/apache/iceberg-python/tree/3ccdc44735d70bd3ef6ed18b60b3eba43c4b3b44

The release tarball, signature, and checksums are here:

   - https://dist.apache.org/repos/dist/dev/iceberg/pyiceberg-0.8.0rc2/

You can find the KEYS file here:

   - https://downloads.apache.org/iceberg/KEYS

Convenience binary artifacts are staged on PyPI:

https://pypi.org/project/pyiceberg/0.8.0rc2/

They can be installed using: pip3 install pyiceberg==0.8.0rc2

Instructions for verifying a release can be found here:

   - https://py.iceberg.apache.org/verify-release/

Please download, verify, and test.
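
As a rough illustration of the checksum part of that verification, the sketch below recomputes a SHA-512 digest with Python's standard hashlib module and compares it with a published value. It assumes the checksum file starts with the hex digest, and the file names in the final comment are hypothetical placeholders. It does not cover the GPG signature check; the instructions linked above remain the reference.

```python
import hashlib
from pathlib import Path


def sha512_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(artifact_path, checksum_path):
    """Compare an artifact's digest with the first token of a checksum file.

    Assumption: the checksum file starts with the hex digest, optionally
    followed by the artifact's file name.
    """
    expected = Path(checksum_path).read_text().split()[0].lower()
    return sha512_of_file(artifact_path) == expected


# Hypothetical file names, for illustration only:
# checksum_matches("pyiceberg-0.8.0.tar.gz", "pyiceberg-0.8.0.tar.gz.sha512")
```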

High-level Summary

   - 185 new commits
     <https://github.com/apache/iceberg-python/compare/pyiceberg-0.7.1...pyiceberg-0.8.0rc2>
   - 18 new first-time contributors
   - Deprecation Notice
      - Deprecated configuration properties: profile_name, region_name,
        aws_access_key_id, aws_secret_access_key, and aws_session_token
      - Deprecated functions: to_requested_schema in pyiceberg/io/pyarrow.py
        and add_snapshot and set_ref_snapshot in pyiceberg/table/__init__.py
   - Find a detailed list of PRs at
     https://github.com/apache/iceberg-python/releases/tag/pyiceberg-0.8.0rc2
   - Highlights
      - Documentation improvements
         - Improve docstrings, configuration, etc.
         - Improve the release process; update “How to Release” and “Verify
           Release” documentation
      - General
         - Add support for Python 3.12; drop support for Python 3.8; exclude
           Python 3.9.7
         - Bump PyArrow to 18.0.0; remove numpy as a hard dependency
         - Bump Iceberg version to 1.6.0 in integration tests
         - Update release and verify release to use KEYS from Apache’s
           `dist/release` repo
      - Features
         - Add metadata tables for data_files and delete_files
         - Add list_views and drop_view to the REST catalog
         - Add partition MonthTransform
         - Support manifest file caching
         - Support Hive Metastore High Availability mode
         - Add properties to allow configuring small/large PyArrow types on
           read
         - Deprecate redundant catalog identifiers in TableIdentifier and
           row_filter expressions
         - Update metadata-log for non-REST catalogs
         - Add support for boolean expressions and quoted columns in
           row_filter expressions
         - Support setting ARN Role and Session name in S3 and Glue
         - Support bi-directional union of types (int <> long, float <>
           double)
         - Support passing table-token to the commit endpoint
         - Allow setting write.parquet.row-group-limit and
           write.parquet.page-row-limit
         - Deprecate rest.authorization-url in favor of oauth2-server-uri
         - Support s3.signer.endpoint
         - Add support for configuring the access delegation header,
           X-Iceberg-Access-Delegation
         - Remove initial_change usage in TableUpdates
         - Prevent adding duplicate files in the add_files API
         - Support fields with . in name
      - Bug Fixes
         - TableResponse metadata_location can be optional
         - Abort the whole table transaction if any updates in the
           transaction have failed
         - Use appropriate partition spec for delete
         - Use self.table_metadata when in transaction
         - Accept empty arrays in struct field lookup
         - List namespace response in REST catalog with fully qualified
           namespace
         - list_tables method in Glue catalog now only returns tables,
           instead of views+tables
         - Glue and Hive catalogs return only Iceberg tables, instead of
           Hive+Iceberg tables
         - Invert case_sensitive logic in StructType
         - Fix table_exists behavior in the REST catalog
         - Fix bug where reading with to_arrow_batch_reader returns more than
           the limit
         - PyArrow: Pass in null-mask for StructField
         - Fix overwrite when filtering all the data
         - Use the correct spec when rewriting existing manifests
         - Use historical partition field name
         - Fix Position Deletes + row_filter yielding less data when the
           DataFile is large
         - Allow for missing operation in Snapshot metadata
         - Fix tracing existing entries when there are deletes
         - Handle empty RecordBatch within _task_to_record_batches

Please vote in the next 72 hours.

[ ] +1 Release this as PyIceberg 0.8.0
[ ] +0
[ ] -1 Do not release this because...

Best,

Kevin Liu
