[DISCUSS] Iceberg board report - March 2024

Ryan Blue Tue, 12 Mar 2024 11:02:37 -0700

Hi everyone,

Here’s my draft for Iceberg’s ASF board report. If you have anything to
add, please reply!


Ryan

Description:

Apache Iceberg is a table format for huge analytic datasets that is designed
for high performance and ease of use.

Project Status:

Current project status: Ongoing
Issues for the board: None

Membership Data:

Apache Iceberg was founded 2020-05-19 (4 years ago)
There are currently 27 committers and 16 PMC members in this project.
The Committer-to-PMC ratio is roughly 7:4.

Community changes, past quarter:

   - No new PMC members. Last addition was Szehon Ho on 2023-04-20.
   - Bryan Keller was added as committer on 2024-03-02
   - Honah J. was added as committer on 2024-01-11
   - Renjie Liu was added as committer on 2024-03-06

Project Activity:

Releases:

   - Java 1.5.0 was released on 2024-03-11
   - Rust 0.2.0 was released on 2024-02-20 (first release!)
   - Python 0.6.0 was released on 2024-02-19
   - Java 1.4.3 was released on 2023-12-27

Java implementation:

   - 1.5.0 is the first release supporting Iceberg Views
   - Added View resolution support in Spark engine integration
   - Added View commands to Spark (SHOW/CREATE/DROP/etc.)
   - View support in Trino is unblocked by the 1.5.0 release
   - Added View support to REST, Nessie, and JDBC catalogs
   - Discussing Materialized View extensions to Iceberg specs
   - Added EncryptingFileIO to minimize encryption-related API changes
   - Added StandardEncryptionManager to implement Iceberg Encryption spec
   - Added Parquet (native) and Avro (AES GCM) encryption support
   - Added pagination to listing in the REST catalog protocol
   - Discussing multiple extensions to the REST protocol (appends, planning)
   - Added delete file cache to Spark
   - Added support for Flink 1.18
   - Removed support for Spark 3.2

Python implementation:

   - 0.6.0 is the first release supporting native writes
   - Append and full table overwrite are supported
   - Only writes to unpartitioned tables are supported
   - Added commit support to JDBC, Glue, and Hive catalogs
   - Implemented name mapping support for reading Parquet files without
     field IDs
   - Actively working on writes to partitioned tables and engine integration

Rust implementation:

   - 0.2.0 is the first Rust release
   - Supports reading metadata files
   - Supports REST catalog interaction
   - Scan planning is the next active area of work

Documentation:

   - Switched to new site build in the iceberg repository so contributing
     is easier

Community Health:

The Iceberg community continues to be healthy. Although commit and PR
activity declined, the metrics indicate that activity was still strong (with
70 contributors and nearly 1,000 commits). This quarter also included
holidays (which usually bring decreased activity) and a huge increase in
mailing list traffic (60%) because the community has been having many design
discussions about evolving the REST spec, introducing new specs
(materialized views), and how to keep track of new design proposals.

The community also started organizing an Iceberg Summit, to be held May
14-15. The summit has been cleared by trademarks and the call for proposals
has been posted. More information can be found at:

   - The Iceberg Summit website: https://iceberg-summit.org/
   - The Call for Proposals: https://sessionize.com/iceberg-summit-2024/

-- 
Ryan Blue
