> Flink Range distribution for Sinks

It is already included in Ryan's draft.

> Flink Source V2 improvements and V1 deprecation to prepare for Flink 2.0

This is still ongoing. There is a blocking issue with FileIOParser on
HadoopFileIO: https://github.com/apache/iceberg/pull/10926


On Tue, Sep 10, 2024 at 10:06 PM Péter Váry <peter.vary.apa...@gmail.com>
wrote:

> Maybe mention some Flink ongoing tasks, improvements:
> - Flink Range distribution for Sinks
> - Flink Source V2 improvements and V1 deprecation to prepare for Flink 2.0
> - Flink Sink V2 implementation to prepare for Flink 2.0
> - Flink Table Maintenance (ongoing)
>
> Thanks for preparing this Ryan!
> Peter
>
>
> On Tue, Sep 10, 2024, 23:51 Matt Topol <zotthewiz...@gmail.com> wrote:
>
>> There's one additional point to add for the Go implementation: we
>> implemented file scan planning. It returns the list of file scan tasks
>> needed for a given table, partitions and filter expression.
>>
>> --Matt
>>
>> On Tue, Sep 10, 2024, 5:43 PM rdb...@gmail.com <rdb...@gmail.com> wrote:
>>
>>> Hi everyone,
>>>
>>> It’s time for another ASF board report! Here’s my current draft. Please
>>> reply if you think there is something that I should add or change. Thanks!
>>>
>>> Ryan
>>> Description:
>>>
>>> Apache Iceberg is a table format for huge analytic datasets that is
>>> designed
>>> for high performance and ease of use.
>>> Project Status:
>>>
>>> Current project status: Ongoing
>>> Issues for the board: None
>>> Membership Data:
>>>
>>> Apache Iceberg was founded 2020-05-19 (4 years ago)
>>> There are currently 31 committers and 21 PMC members in this project.
>>> The Committer-to-PMC ratio is roughly 4:3.
>>>
>>> Community changes, past quarter:
>>>
>>>    - Amogh Jahagirdar was added to the PMC on 2024-08-12
>>>    - Eduard Tudenhoefner was added to the PMC on 2024-08-12
>>>    - Honah J. was added to the PMC on 2024-07-22
>>>    - Renjie Liu was added to the PMC on 2024-07-22
>>>    - Peter Vary was added to the PMC on 2024-08-12
>>>    - Piotr Findeisen was added as committer on 2024-07-24
>>>    - Kevin Liu was added as committer on 2024-07-24
>>>    - Sung Yun was added as committer on 2024-07-24
>>>    - Hao Ding was added as committer on 2024-07-23
>>>
>>> Project Activity:
>>>
>>> Releases:
>>>
>>>    - Java 1.6.1 was released on 2024-08-28
>>>    - Rust 0.3.0 was released on 2024-08-20
>>>    - PyIceberg 0.7.1 was released on 2024-08-18
>>>    - PyIceberg 0.7.0 was released on 2024-07-30
>>>    - Java 1.6.0 was released on 2024-07-23
>>>
>>> Table format:
>>>
>>>    - Work for v3 is picking up
>>>    - Committed timestamp_ns implementation
>>>    - Ongoing discussion/proposal for improvements to row-level deletes
>>>    - Ongoing discussion/proposal for row-level metadata for change
>>>    tracking
>>>    - Discussion for adding variant type and where to maintain the spec
>>>    (Parquet)
>>>    - Making progress on geometry types
>>>    - Clarified transform requirements to add transforms as needed (to
>>>    support geo)
>>>    - Discovered issues affecting new type promotion cases, reduced scope
>>>
>>> REST protocol specification:
>>>
>>>    - Added server-side scan planning
>>>    - Support for removing partition specs
>>>    - Support for endpoint discovery for future additions
>>>    - Clarified failure requirements for unknown actions or validations
>>>
>>> Java:
>>>
>>>    - Added classes for v3 table writes
>>>    - Fixed rewrites in tables with 1000+ columns
>>>    - Added Kafka Connect runtime bundle
>>>    - Support for Flink 1.20
>>>    - Added range distribution support in Flink
>>>    - Dropped support for Java 8
>>>
>>> PyIceberg:
>>>
>>>    - Discussed adding a dependency on iceberg-rust for native extensions
>>>    - Write support for time and identity transforms
>>>    - Parallelized large writes
>>>    - Support for deletes using filter predicates
>>>    - Staged table creation for atomic CTAS
>>>    - Support manifest merging on write
>>>    - Better integration with PyArrow to produce lazy readers from scans
>>>    - New API to add existing Parquet files
>>>    - Support custom catalogs
>>>
>>> Rust:
>>>
>>>    - Established subproject pyiceberg_core to support PyIceberg
>>>    - Implemented OAuth for catalog REST client
>>>    - Added Parquet writer and reader capabilities with support for data
>>>    projection
>>>    - Introduced memory catalog and memory file IO support
>>>    - Initialized SQL Catalog
>>>    - Added support for GCS storage and AWS session tokens
>>>    - Implemented concurrent table scans and data file fetching
>>>    - Enhanced predicate builders and expression evaluators
>>>    - Added support for timestamp columns in row filters
>>>
>>> Go:
>>>
>>>    - Implemented expressions and expression visitors
>>>
>>> Community Health:
>>>
>>> Several new committers and PMC members were added this quarter, which is
>>> a good
>>> indicator for community health. There was also a significant number of
>>> threads
>>> on the mailing list about setting expectations for contributors and
>>> clearly
>>> documenting how the community operates. New guidelines for merging PRs have
>>> been
>>> added to the website and the community is also discussing guidelines for
>>> how
>>> contributors can become committers. This builds on work from last
>>> quarter that
>>> clarified the process for design discussions.
>>>
>>> Many of the topics under discussion were raised because of the
>>> acquisition that
>>> was noted in the last board report. The community has been working to
>>> address
>>> the concerns raised, which are primarily in 3 areas:
>>>
>>>    - How decisions are made about designs and commits (now clarified)
>>>    - How contributors become committers and PMC members (under
>>>    discussion)
>>>    - How the community operates when people cannot reach consensus
>>>
>>> The last concern has historically not been a problem; people have so far
>>> chosen to “disagree and commit” when a large majority in the community
>>> has
>>> a different opinion. However, the first instance of this was encountered
>>> near
>>> the end of the quarter. The community and PMC need to discuss how to make
>>> progress on the issue.
>>>
>>
