There's one additional point to add for the Go implementation: we
implemented file scan planning. It returns the list of file scan tasks
needed for a given table, partitions, and filter expression.

--Matt

On Tue, Sep 10, 2024, 5:43 PM rdb...@gmail.com <rdb...@gmail.com> wrote:

> Hi everyone,
>
> It’s time for another ASF board report! Here’s my current draft. Please
> reply if you think there is something that I should add or change. Thanks!
>
> Ryan
>
> Description:
>
> Apache Iceberg is a table format for huge analytic datasets that is
> designed for high performance and ease of use.
>
> Project Status:
>
> Current project status: Ongoing
> Issues for the board: None
>
> Membership Data:
>
> Apache Iceberg was founded 2020-05-19 (4 years ago)
> There are currently 31 committers and 21 PMC members in this project.
> The Committer-to-PMC ratio is roughly 4:3.
>
> Community changes, past quarter:
>
>    - Amogh Jahagirdar was added to the PMC on 2024-08-12
>    - Eduard Tudenhoefner was added to the PMC on 2024-08-12
>    - Honah J. was added to the PMC on 2024-07-22
>    - Renjie Liu was added to the PMC on 2024-07-22
>    - Peter Vary was added to the PMC on 2024-08-12
>    - Piotr Findeisen was added as committer on 2024-07-24
>    - Kevin Liu was added as committer on 2024-07-24
>    - Sung Yun was added as committer on 2024-07-24
>    - Hao Ding was added as committer on 2024-07-23
>
> Project Activity:
>
> Releases:
>
>    - Java 1.6.1 was released on 2024-08-28
>    - Rust 0.3.0 was released on 2024-08-20
>    - PyIceberg 0.7.1 was released on 2024-08-18
>    - PyIceberg 0.7.0 was released on 2024-07-30
>    - Java 1.6.0 was released on 2024-07-23
>
> Table format:
>
>    - Work for v3 is picking up
>    - Committed timestamp_ns implementation
>    - Ongoing discussion/proposal for improvements to row-level deletes
>    - Ongoing discussion/proposal for row-level metadata for change
>    tracking
>    - Discussion for adding variant type and where to maintain the spec
>    (Parquet)
>    - Making progress on geometry types
>    - Clarified transform requirements to add transforms as needed (to
>    support geo)
>    - Discovered issues affecting new type promotion cases, reduced scope
>
> REST protocol specification:
>
>    - Added server-side scan planning
>    - Support for removing partition specs
>    - Support for endpoint discovery for future additions
>    - Clarified failure requirements for unknown actions or validations
>
> Java:
>
>    - Added classes for v3 table writes
>    - Fixed rewrites in tables with 1000+ columns
>    - Added Kafka Connect runtime bundle
>    - Support for Flink 1.20
>    - Added range distribution support in Flink
>    - Dropped support for Java 8
>
> PyIceberg:
>
>    - Discussed adding a dependency on iceberg-rust for native extensions
>    - Write support for time and identity transforms
>    - Parallelized large writes
>    - Support for deletes using filter predicates
>    - Staged table creation for atomic CTAS
>    - Support manifest merging on write
>    - Better integration with PyArrow to produce lazy readers from scans
>    - New API to add existing Parquet files
>    - Support custom catalogs
>
> Rust:
>
>    - Established subproject pyiceberg_core to support PyIceberg
>    - Implemented OAuth for catalog REST client
>    - Added Parquet writer and reader capabilities with support for data
>    projection
>    - Introduced memory catalog and memory file IO support
>    - Initialized SQL Catalog
>    - Added support for GCS storage and AWS session tokens
>    - Implemented concurrent table scans and data file fetching
>    - Enhanced predicate builders and expression evaluators
>    - Added support for timestamp columns in row filters
>
> Go:
>
>    - Implemented expressions and expression visitors
>
> Community Health:
>
> Several new committers and PMC members were added this quarter, which is a
> good indicator for community health. There was also a significant number of
> threads on the mailing list about setting expectations for contributors and
> clearly documenting how the community operates. New guidelines for merging
> PRs have been added to the website, and the community is also discussing
> guidelines for how contributors can become committers. This builds on work
> from last quarter that clarified the process for design discussions.
>
> Many of the topics under discussion were raised because of the acquisition
> that was noted in the last board report. The community has been working to
> address the concerns raised, which are primarily in 3 areas:
>
>    - How decisions are made about designs and commits (now clarified)
>    - How contributors become committers and PMC members (under discussion)
>    - How the community operates when people cannot reach consensus
>
> The last concern has historically not been a problem; people have so far
> chosen to “disagree and commit” when a large majority in the community has
> a different opinion. However, the first instance of the community not
> reaching consensus was encountered near the end of the quarter. The
> community and PMC need to discuss how to make progress on the issue.
>
