Re: [ANNOUNCE] Apache Iceberg release 1.9.2

2025-07-18 Thread Maximilian Michels
Thanks Prashant for managing this release! -Max On Fri, Jul 18, 2025 at 7:42 AM Manu Zhang wrote: > > Thanks Prashant for driving the release. Can you also create a new release at > https://github.com/apache/iceberg/releases and update the links in the > release notes? > > Regards, > Manu > >

Re: [Discuss] Iceberg Blog Posts

2025-07-18 Thread Maximilian Michels
nal" blog posts (we can actually do both). > > Regards > JB > > On Thu, Jul 17, 2025 at 3:07 PM Maximilian Michels wrote: > > > > Hey Robin, > > > > Crossposting can make sense, but the bar for "official posts" from the > > Iceberg co

Re: [Discuss] Iceberg Blog Posts

2025-07-17 Thread Maximilian Michels
tent, but I don't think that is relevant here. > > thanks, Robin. > > On Thu, 17 Jul 2025 at 10:22, Maximilian Michels wrote: >> >> Thanks for the feedback thus far. >> >> @Robin: To your points: >> >> 1. I would keep the existing blog page, but m

Re: [Discuss] Iceberg Blog Posts

2025-07-17 Thread Maximilian Michels
tiste Onofré wrote: >> >> Hi Max >> >> +1 it makes sense to me, especially for posts directly >> presenting/announcing something in the project. >> >> Several Apache projects do that, up to the project. >> >> Regards >> JB >> >> O

[Discuss] Iceberg Blog Posts

2025-07-16 Thread Maximilian Michels
Hi, I noticed that all blog posts on https://iceberg.apache.org/blogs/ are hosted externally. Would it make sense to publish community posts directly on the Iceberg site? Possible topics: - Release announcements - Presenting new features - Deep dives into internals - Use cases For example, I was

Re: [VOTE] Release Apache Iceberg 1.9.2 RC0

2025-07-16 Thread Maximilian Michels
+1 (non-binding) 1. Verified the archive checksum and signature 2. Extracted and inspected the source code for binaries 3. Compiled the source code 4. Verified license files / headers 5. Ran Flink tests -Max On Tue, Jul 15, 2025 at 10:27 PM Yuya Ebihara < yuya.ebih...@starburstdata.com> wrote:
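The checksum half of step 1 in the message above can be sketched in a few lines of Python. This is a minimal illustration, not the project's release tooling: it assumes the archive ships with a SHA-512 digest file whose first token is the hex digest, and the helper names `sha512_of` and `checksum_matches` are made up for this example. Signature verification is a separate step, typically done with GnuPG, and is not covered here.

```python
import hashlib
from pathlib import Path


def sha512_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-512 digest of a file, reading it in chunks."""
    digest = hashlib.sha512()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(archive: Path, checksum_file: Path) -> bool:
    """Compare the archive digest with the first token of the checksum file.

    Assumption: the checksum file starts with the hex digest, optionally
    followed by the archive name (the common `shasum` output layout).
    """
    expected = checksum_file.read_text().split()[0].lower()
    return sha512_of(archive) == expected
```

Calling `checksum_matches(Path("release.tar.gz"), Path("release.tar.gz.sha512"))` with the downloaded files (placeholder names) returns `True` only when the digests agree.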

Re: [DISCUSS] V4 - indexing support

2025-07-15 Thread Maximilian Michels
Thanks Steven for the summary. It would be great to extend the Iceberg spec with index files, such that they can be used for the different use cases. For my understanding, let me further outline the different types of use cases for index files: --- Topic 1: Accelerating the resolution of equality

Re: Flink: Current and Future state of the sink connectors

2025-06-30 Thread Maximilian Michels
ounter blockers. On Mon, Jun 30, 2025 at 11:42 AM Péter Váry wrote: > Is this 5-10-100-1000 different use-case we have seen the migration for? > Also, it would be good to see if someone in the community has some > experience too. > > Maximilian Michels wrote (on 2025.

Re: Append-only table scans in the presence of OVERWRITE snapshots

2025-06-30 Thread Maximilian Michels
ter Váry wrote: > Minimally LOG.warn message about deprecation. > Maybe a "hidden" flag which could turn back to skip overwrite snapshots. > This flag could be deprecated immediately and removed in the next release. > Maybe wait until 2.0, where we can introduce breaking changes? >

Re: Append-only table scans in the presence of OVERWRITE snapshots

2025-06-30 Thread Maximilian Michels
25. Jun 27, Fri, 15:28): > >> Sounds good to me! >> >> Gyula >> Sent from my iPhone >> >> On 27 Jun 2025, at 13:48, Maximilian Michels wrote: >> >> >> In my understanding, overwrite snapshots are specifically there to >> overwrite data

Re: Flink: Current and Future state of the sink connectors

2025-06-30 Thread Maximilian Michels
he > experience there? > > Maximilian Michels wrote (on Jun 27, 2025, Fri, > 13:49): > >> Peter, I was suggesting not to remove earlier, but to switch the default >> sink implementation earlier. Is that what you mean? >> >> -Max >> >> On T

Re: Flink: Current and Future state of the sink connectors

2025-06-27 Thread Maximilian Michels
Peter, I was suggesting not to remove earlier, but to switch the default sink implementation earlier. Is that what you mean? -Max On Thu, Jun 26, 2025 at 5:19 PM Péter Váry wrote: > +1 for the more conservative removal time > > Maximilian Michels wrote (on Jun 26, 2025, >

Re: Append-only table scans in the presence of OVERWRITE snapshots

2025-06-27 Thread Maximilian Michels
> >> OVERWRITE snapshots themselves could still contain deletes. So in this >> regard, I don't see a difference between the DELETE and the OVERWRITE >> snapshots. >> >> Maximilian Michels wrote (on Jun 26, 2025, >> Thu, 11:03): >> >

Re: Flink: Current and Future state of the sink connectors

2025-06-26 Thread Maximilian Michels
found @stevenzu commit from August 2024 [1] >> which is related to this unit test also hanging and that adding >> `taskInfo.getIndexOfThisSubtask() == 0` was necessary for the unit test to >> pass. >> >> >> I am still digging deeper, but so far I haven’t been

Re: Append-only table scans in the presence of OVERWRITE snapshots

2025-06-26 Thread Maximilian Michels
ion? >> >> > 3. Throw an error if overwrite snapshots are discovered in an >> append-only scan and the option in (2) is not activated >> >> It would be interesting to hear what people think about this. In some >> scenarios, this is probably better. >> >

Append-only table scans in the presence of OVERWRITE snapshots

2025-06-25 Thread Maximilian Michels
Hi, It is well known that Flink and other Iceberg engines do not support "merge-on-read" in streaming/incremental read mode. There are plans to change that, see the "Improving Merge-On-Read Query Performance" thread, but this is not what this message is about. I used to think that when we increme

Re: Flink: Current and Future state of the sink connectors

2025-06-24 Thread Maximilian Michels
+1 Great proposal. What about moving the timeline one release ahead? There isn't any more planned work for IcebergSink between the 1.10.0 and the 1.11.0 release. IcebergSink already got enough exposure and we won't gain anything from waiting longer to make it the default. Thus, I would suggest to

Re: [DISCUSS] Apache Iceberg 1.10.0 release

2025-05-28 Thread Maximilian Michels
Hi Steven, When do you plan to cut the release branch? I would like to get https://github.com/apache/iceberg/pull/12424 in. Thanks, Max On Wed, May 28, 2025 at 11:28 AM Jean-Baptiste Onofré wrote: > > Hi > > I think I have multi-args transforms in good shape to be in the scope > for 1.10.0. Rel

Re: [DISCUSS] Enabling more Meetups

2025-05-28 Thread Maximilian Michels
Hi, Thanks for the proposal Russell! JB also raised some fair points regarding use of trademarks. We can adjust the proposal slightly to include the use of trademarks: """ The Apache Iceberg community supports Apache Iceberg meetups. The name "Apache Iceberg Meetup " may be used in compliance wi

Re: [Announce] Iceberg Meetup Berlin - 3rd of July

2025-05-28 Thread Maximilian Michels
Hi Christian, Thanks for posting the link! Looking forward to the meetup. Cheers, Max On Fri, May 23, 2025 at 8:10 PM Christian Thiel wrote: > > Hey everyone, > > The next Iceberg Meetup Europe is coming to Berlin on July 3rd! > Please register at: https://lu.ma/pposobem > The Call for Speakers

[DISCUSS] Dynamic Schema / Partition Spec Changes

2025-05-21 Thread Maximilian Michels
Hi folks, Imagine a data processing job, which reads from a data source and outputs to Iceberg tables. The catch is that table name, schema / spec, branch, etc. are only available at runtime in the data itself. Many users have worked around this by writing the data to a temporary destination, then

Re: Core changes for Flink Dynamic Iceberg Sink

2025-05-21 Thread Maximilian Michels
h like how SchemaUpdate is itself tested. > > Ryan > > > On Mon, May 19, 2025 at 9:41 AM Maximilian Michels wrote: >> >> Hi, >> >> The Flink Dynamic Iceberg Sink requires a few iceberg-core class >> visibility changes, which I'd like to get your fe

Core changes for Flink Dynamic Iceberg Sink

2025-05-19 Thread Maximilian Michels
Hi, The Flink Dynamic Iceberg Sink requires a few iceberg-core class visibility changes, which I'd like to get your feedback on. Here is the diff: https://github.com/apache/iceberg/pull/13032/commits/1804b0ac4ff97c3c943463725e91a1e24b0f8c44 There are three visibility changes: Change 1: Make Nam

Re: [VOTE] Release Apache Iceberg 1.9.1 RC0

2025-05-19 Thread Maximilian Michels
+1 (non-binding) 1. Verified the archive checksum and signature 2. Extracted and inspected the source code for binaries 3. Compiled and tested the source code 4. Verified license files / headers -Max On Mon, May 19, 2025 at 6:52 AM Daniel Weeks wrote: > > +1 (binding) > > Verified sigs/sums/lic

Re: [Flink] Remove FlinkSink for Flink 2.0

2025-04-11 Thread Maximilian Michels
ink.apache.org/2025/03/24/apache-flink-2.0.0-a-new-era-of-real-time-data-processing/ Based on that, I would suggest removing FlinkSource. We have an agreement on that. Cheers, Max On Fri, Mar 14, 2025 at 3:14 PM Maximilian Michels wrote: > > Thanks everyone for the well-considered discus

Re: [Flink] Remove FlinkSink for Flink 2.0

2025-03-14 Thread Maximilian Michels
to bake and mature. >>>> - Flink sink (for streaming ingestion) is probably a lot more widely used >>>> than Flink source (for batch or streaming read). Hence, it is valuable to >>>> be more conservative in the deprecation plan here. >>>> - we can

Re: [Flink] Remove FlinkSink for Flink 2.0

2025-03-12 Thread Maximilian Michels
fault value, and I'm checking the new >> types now). >> >> Please let me know :) >> >> Thanks, >> Regards >> JB >> >> On Tue, Mar 11, 2025 at 11:23 AM Maximilian Michels wrote: >> > >> > Hi JB, >> > >>

Re: [Flink] Remove FlinkSink for Flink 2.0

2025-03-11 Thread Maximilian Michels
should be ready soon. >>> >>> >>> >>> On Thu, Mar 6, 2025 at 8:18 AM Jean-Baptiste Onofré >>> wrote: >>>> >>>> Hi Max, >>>> >>>> I guess you are proposing to remove FlinkSink and the corresponding >>

Re: [Flink] Remove FlinkSink for Flink 2.0

2025-03-06 Thread Maximilian Michels
I forgot to add that we have the same for the Flink readers: 1. FlinkSource (legacy API) 2. IcebergSource (modern API) I would also like to propose removing FlinkSource. (2) is already the default Flink reader. Thanks, Max On Thu, Mar 6, 2025 at 4:21 PM Maximilian Michels wrote: > >

[Flink] Remove FlinkSink for Flink 2.0

2025-03-06 Thread Maximilian Michels
Hi, Today there are two Flink write connectors in Iceberg: 1. FlinkSink (original sink, based on Flink legacy interfaces) 2. IcebergSink (newer version, based on modern Flink API) In terms of features, (1) is a subset of (2). I'm in the process of adding support for Flink 2.0. The interfaces us

Re: Dynamic Flink Iceberg Sink

2025-02-28 Thread Maximilian Michels
Good news. There is now a pull request available: https://github.com/apache/iceberg/pull/12424 -Max On Thu, Nov 21, 2024 at 4:36 PM Ferenc Csaky wrote: > > Hello devs, > > +1 from my side, as I look things from the Flink perspective. The Flink > mailing list thread Peter > linked in his previou

Re: Dynamic Flink Iceberg Sink

2024-11-13 Thread Maximilian Michels
Thanks for taking the lead on this, Peter! The Dynamic Iceberg Sink is designed to address several challenges with the current Flink Iceberg sink. It offers three main benefits: 1. *Flexibility in Table Writing*: It allows writing to multiple tables, eliminating the 1:1 sink-to-topic restri

Re: [DISCUSS] - Deprecate Equality Deletes

2024-10-31 Thread Maximilian Michels
How would Flink or other engines support upserts or deletions without equality deletes? The only option would be to use positional deletes, but that requires scanning all data files to find the correct positions. IMHO the separation between deciding to delete based on an equality field and applying
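The trade-off raised in this message can be illustrated with a toy model. This is not Iceberg's API: data files are plain Python lists of rows, and every function name below is hypothetical. The sketch also ignores details such as sequence numbers. It only shows that an equality delete can be written from the key alone, while a positional delete forces the writer to scan the data files to find the matching row positions.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, object]
DataFiles = Dict[str, List[Row]]  # file name -> rows, a stand-in for data files


def equality_delete(key_field: str, key_value: object) -> Tuple[str, object]:
    """Writer side: record only the predicate; no data file is read."""
    return (key_field, key_value)


def positional_deletes(
    files: DataFiles, key_field: str, key_value: object
) -> Set[Tuple[str, int]]:
    """Writer side: scan every file to locate (file, position) pairs."""
    return {
        (name, pos)
        for name, rows in files.items()
        for pos, row in enumerate(rows)
        if row.get(key_field) == key_value
    }


def read_with_equality_delete(
    files: DataFiles, delete: Tuple[str, object]
) -> List[Row]:
    """Reader side: filter rows by comparing the key field."""
    field, value = delete
    return [
        row for rows in files.values() for row in rows if row.get(field) != value
    ]


def read_with_positional_deletes(
    files: DataFiles, deletes: Set[Tuple[str, int]]
) -> List[Row]:
    """Reader side: drop rows whose (file, position) was marked deleted."""
    return [
        row
        for name, rows in files.items()
        for pos, row in enumerate(rows)
        if (name, pos) not in deletes
    ]
```

In this model both read paths return the same rows; the difference is where the scan happens. With equality deletes the reader pays for the comparison, with positional deletes the writer pays for the lookup.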