This looks good to me

On Wed, Sep 11, 2024 at 12:35 AM Steven Wu <stevenz...@gmail.com> wrote:

> > Flink Range distribution for Sinks
>
> It is already included in Ryan's draft
>
> > Flink Source V2 improvements and V1 deprecation to prepare for Flink 2.0
>
> This is still ongoing. There is a blocking issue with FileIOParser on
> HadoopFileIO: https://github.com/apache/iceberg/pull/10926
>
>
> On Tue, Sep 10, 2024 at 10:06 PM Péter Váry <peter.vary.apa...@gmail.com>
> wrote:
>
>> Maybe mention some Flink ongoing tasks, improvements:
>> - Flink Range distribution for Sinks
>> - Flink Source V2 improvements and V1 deprecation to prepare for Flink 2.0
>> - Flink Sink V2 implementation to prepare for Flink 2.0
>> - Flink Table Maintenance (ongoing)
>>
>> Thanks for preparing this Ryan!
>> Peter
>>
>>
>> On Tue, Sep 10, 2024, 23:51 Matt Topol <zotthewiz...@gmail.com> wrote:
>>
>>> There's one additional point to add for the Go implementation, we
>>> implemented file scan planning. It returns the list of file scan tasks
>>> needed for a given table, partitions and filter expression.
>>>
>>> --Matt
>>>
>>> On Tue, Sep 10, 2024, 5:43 PM rdb...@gmail.com <rdb...@gmail.com> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> It’s time for another ASF board report! Here’s my current draft. Please
>>>> reply if you think there is something that I should add or change. Thanks!
>>>>
>>>> Ryan
>>>>
>>>> Description:
>>>>
>>>> Apache Iceberg is a table format for huge analytic datasets that is
>>>> designed for high performance and ease of use.
>>>>
>>>> Project Status:
>>>>
>>>> Current project status: Ongoing
>>>> Issues for the board: None
>>>>
>>>> Membership Data:
>>>>
>>>> Apache Iceberg was founded 2020-05-19 (4 years ago)
>>>> There are currently 31 committers and 21 PMC members in this project.
>>>> The Committer-to-PMC ratio is roughly 4:3.
>>>>
>>>> Community changes, past quarter:
>>>>
>>>> - Amogh Jahagirdar was added to the PMC on 2024-08-12
>>>> - Eduard Tudenhoefner was added to the PMC on 2024-08-12
>>>> - Honah J. was added to the PMC on 2024-07-22
>>>> - Renjie Liu was added to the PMC on 2024-07-22
>>>> - Peter Vary was added to the PMC on 2024-08-12
>>>> - Piotr Findeisen was added as committer on 2024-07-24
>>>> - Kevin Liu was added as committer on 2024-07-24
>>>> - Sung Yun was added as committer on 2024-07-24
>>>> - Hao Ding was added as committer on 2024-07-23
>>>>
>>>> Project Activity:
>>>>
>>>> Releases:
>>>>
>>>> - Java 1.6.1 was released on 2024-08-28
>>>> - Rust 0.3.0 was released on 2024-08-20
>>>> - PyIceberg 0.7.1 was released on 2024-08-18
>>>> - PyIceberg 0.7.0 was released on 2024-07-30
>>>> - Java 1.6.0 was released on 2024-07-23
>>>>
>>>> Table format:
>>>>
>>>> - Work for v3 is picking up
>>>> - Committed timestamp_ns implementation
>>>> - Ongoing discussion/proposal for improvements to row-level deletes
>>>> - Ongoing discussion/proposal for row-level metadata for change tracking
>>>> - Discussion for adding variant type and where to maintain the spec
>>>>   (Parquet)
>>>> - Making progress on geometry types
>>>> - Clarified transform requirements to add transforms as needed (to
>>>>   support geo)
>>>> - Discovered issues affecting new type promotion cases, reduced scope
>>>>
>>>> REST protocol specification:
>>>>
>>>> - Added server-side scan planning
>>>> - Support for removing partition specs
>>>> - Support for endpoint discovery for future additions
>>>> - Clarified failure requirements for unknown actions or validations
>>>>
>>>> Java:
>>>>
>>>> - Added classes for v3 table writes
>>>> - Fixed rewrites in tables with 1000+ columns
>>>> - Added Kafka Connect runtime bundle
>>>> - Support for Flink 1.20
>>>> - Added range distribution support in Flink
>>>> - Dropped support for Java 8
>>>>
>>>> PyIceberg:
>>>>
>>>> - Discussed adding a dependency on iceberg-rust for native extensions
>>>> - Write support for time and identity transforms
>>>> - Parallelized large writes
>>>> - Support for deletes using filter predicates
>>>> - Staged table creation for atomic CTAS
>>>> - Support manifest merging on write
>>>> - Better integration with PyArrow to produce lazy readers from scans
>>>> - New API to add existing Parquet files
>>>> - Support custom catalogs
>>>>
>>>> Rust:
>>>>
>>>> - Established subproject pyiceberg_core to support PyIceberg
>>>> - Implemented OAuth for catalog REST client
>>>> - Added Parquet writer and reader capabilities with support for data
>>>>   projection
>>>> - Introduced memory catalog and memory file IO support
>>>> - Initialized SQL Catalog
>>>> - Added support for GCS storage and AWS session tokens
>>>> - Implemented concurrent table scans and data file fetching
>>>> - Enhanced predicate builders and expression evaluators
>>>> - Added support for timestamp columns in row filters
>>>>
>>>> Go:
>>>>
>>>> - Implemented expressions and expression visitors
>>>>
>>>> Community Health:
>>>>
>>>> Several new committers and PMC members were added this quarter, which is
>>>> a good indicator for community health. There was also a significant
>>>> number of threads on the mailing list about setting expectations for
>>>> contributors and clearly documenting how the community operates. New
>>>> guidelines for merging PRs have been added to the website and the
>>>> community is also discussing guidelines for how contributors can become
>>>> committers. This builds on work from last quarter that clarified the
>>>> process for design discussions.
>>>>
>>>> Many of the topics under discussion were raised because of the
>>>> acquisition that was noted in the last board report. The community has
>>>> been working to address the concerns raised, which are primarily in 3
>>>> areas:
>>>>
>>>> - How decisions are made about designs and commits (now clarified)
>>>> - How contributors become committers and PMC members (under discussion)
>>>> - How the community operates when people cannot reach consensus
>>>>
>>>> The last concern has historically not been a problem; people have so far
>>>> chosen to “disagree and commit” when a large majority in the community
>>>> has a different opinion. However, the first instance of this was
>>>> encountered near the end of the quarter. The community and PMC need to
>>>> discuss how to make progress on the issue.
>>>>
>>>