Maybe mention some ongoing Flink tasks and improvements:
- Flink Range distribution for Sinks
- Flink Source V2 improvements and V1 deprecation to prepare for Flink 2.0
- Flink Sink V2 implementation to prepare for Flink 2.0
- Flink Table Maintenance (ongoing)

Thanks for preparing this Ryan!
Peter


On Tue, Sep 10, 2024, 23:51 Matt Topol <zotthewiz...@gmail.com> wrote:

> There's one additional point to add for the Go implementation: we
> implemented file scan planning. It returns the list of file scan tasks
> needed for a given table, partitions and filter expression.
>
> --Matt
>
> On Tue, Sep 10, 2024, 5:43 PM rdb...@gmail.com <rdb...@gmail.com> wrote:
>
>> Hi everyone,
>>
>> It’s time for another ASF board report! Here’s my current draft. Please
>> reply if you think there is something that I should add or change. Thanks!
>>
>> Ryan
>>
>> Description:
>>
>> Apache Iceberg is a table format for huge analytic datasets that is designed
>> for high performance and ease of use.
>>
>> Project Status:
>>
>> Current project status: Ongoing
>> Issues for the board: None
>>
>> Membership Data:
>>
>> Apache Iceberg was founded 2020-05-19 (4 years ago)
>> There are currently 31 committers and 21 PMC members in this project.
>> The Committer-to-PMC ratio is roughly 4:3.
>>
>> Community changes, past quarter:
>>
>>    - Amogh Jahagirdar was added to the PMC on 2024-08-12
>>    - Eduard Tudenhoefner was added to the PMC on 2024-08-12
>>    - Honah J. was added to the PMC on 2024-07-22
>>    - Renjie Liu was added to the PMC on 2024-07-22
>>    - Peter Vary was added to the PMC on 2024-08-12
>>    - Piotr Findeisen was added as committer on 2024-07-24
>>    - Kevin Liu was added as committer on 2024-07-24
>>    - Sung Yun was added as committer on 2024-07-24
>>    - Hao Ding was added as committer on 2024-07-23
>>
>> Project Activity:
>>
>> Releases:
>>
>>    - Java 1.6.1 was released on 2024-08-28
>>    - Rust 0.3.0 was released on 2024-08-20
>>    - PyIceberg 0.7.1 was released on 2024-08-18
>>    - PyIceberg 0.7.0 was released on 2024-07-30
>>    - Java 1.6.0 was released on 2024-07-23
>>
>> Table format:
>>
>>    - Work for v3 is picking up
>>    - Committed timestamp_ns implementation
>>    - Ongoing discussion/proposal for improvements to row-level deletes
>>    - Ongoing discussion/proposal for row-level metadata for change
>>    tracking
>>    - Discussion for adding variant type and where to maintain the spec
>>    (Parquet)
>>    - Making progress on geometry types
>>    - Clarified transform requirements to add transforms as needed (to
>>    support geo)
>>    - Discovered issues affecting new type promotion cases, reduced scope
>>
>> REST protocol specification:
>>
>>    - Added server-side scan planning
>>    - Support for removing partition specs
>>    - Support for endpoint discovery for future additions
>>    - Clarified failure requirements for unknown actions or validations
>>
>> Java:
>>
>>    - Added classes for v3 table writes
>>    - Fixed rewrites in tables with 1000+ columns
>>    - Added Kafka Connect runtime bundle
>>    - Support for Flink 1.20
>>    - Added range distribution support in Flink
>>    - Dropped support for Java 8
>>
>> PyIceberg:
>>
>>    - Discussed adding a dependency on iceberg-rust for native extensions
>>    - Write support for time and identity transforms
>>    - Parallelized large writes
>>    - Support for deletes using filter predicates
>>    - Staged table creation for atomic CTAS
>>    - Support manifest merging on write
>>    - Better integration with PyArrow to produce lazy readers from scans
>>    - New API to add existing Parquet files
>>    - Support custom catalogs
>>
>> Rust:
>>
>>    - Established subproject pyiceberg_core to support PyIceberg
>>    - Implemented OAuth for catalog REST client
>>    - Added Parquet writer and reader capabilities with support for data
>>    projection
>>    - Introduced memory catalog and memory file IO support
>>    - Initialized SQL Catalog
>>    - Added support for GCS storage and AWS session tokens
>>    - Implemented concurrent table scans and data file fetching
>>    - Enhanced predicate builders and expression evaluators
>>    - Added support for timestamp columns in row filters
>>
>> Go:
>>
>>    - Implemented expressions and expression visitors
>>
>> Community Health:
>>
>> Several new committers and PMC members were added this quarter, which is a
>> good indicator for community health. There was also a significant number of
>> threads on the mailing list about setting expectations for contributors and
>> clearly documenting how the community operates. New guidelines for merging
>> PRs have been added to the website, and the community is also discussing
>> guidelines for how contributors can become committers. This builds on work
>> from last quarter that clarified the process for design discussions.
>>
>> Many of the topics under discussion were raised because of the acquisition
>> that was noted in the last board report. The community has been working to
>> address the concerns raised, which are primarily in 3 areas:
>>
>>    - How decisions are made about designs and commits (now clarified)
>>    - How contributors become committers and PMC members (under
>>    discussion)
>>    - How the community operates when people cannot reach consensus
>>
>> The last concern has historically not been a problem; people have so far
>> chosen to “disagree and commit” when a large majority in the community has
>> a different opinion. However, the first instance of people being unable to
>> reach consensus was encountered near the end of the quarter. The community
>> and PMC need to discuss how to make progress on the issue.
>>
>
