Hello Max, Peter,

Happy New Year. Could you share the repository, even if it's not fully
complete, so I can extend it further for my POC?

Thanks

On Tuesday, December 3, 2024 at 02:01:46 AM PST, Anil Dasari <dasaria...@myyahoo.com> wrote:

Hi Max,

That's great to hear. Thank you for the update. I look forward to
reviewing the code when it's available.
Thanks,
Anil

On Thursday, November 28, 2024 at 04:16:32 AM PST, Maximilian Michels <m...@apache.org> wrote:

Hi Anil,

Great. The Dynamic Iceberg Sink will address your feature request. We'll
soon share the code for review.

Thanks,
Max

On Wed, Nov 20, 2024 at 11:16 PM Anil Dasari <dasaria...@myyahoo.com.invalid> wrote:
> Hi Peter, is there a repository we can begin using for testing or
> contributing? Thanks.
>
> +1 on the feature. I raised a similar request to the Iceberg community
> a month ago: Add support for multiple table DataStream FlinkSink ·
> Issue #11436 · apache/iceberg
>
> Thanks
>
> On Tuesday, November 19, 2024 at 01:12:38 AM PST, ConradJam <
> czy...@apache.org> wrote:
>
> +1 I have been trying to complete the Flink CDC Iceberg pipeline
> recently, and this feature has been helpful for my work.
>
> Ferenc Csaky <ferenc.cs...@pm.me.invalid> wrote on Thursday,
> November 14, 2024 at 21:16:
>
> > Hi Peter, Max,
> >
> > Thanks for starting this discussion, +1 for the proposal. I agree
> > that the added functionality would solve some major pain points for
> > Flink -> Iceberg use cases!
> >
> > Best,
> > Ferenc
> >
> > On Thursday, November 14th, 2024 at 03:54, Congxian Qiu <
> > qcx978132...@gmail.com> wrote:
> >
> > > Thanks for kicking off this discussion, +1 for this proposal.
> > >
> > > Currently, we use FlinkCDC as the ingestion tool to write to
> > > Iceberg, and need to restart the job if there is any schema
> > > evolution. Looking forward to seeing this integrated with FlinkCDC.
> > >
> > > Best,
> > > Congxian
> > >
> > > Leonard Xu <xbjt...@gmail.com> wrote on Thursday, November 14,
> > > 2024 at 09:37:
> > >
> > > > Thanks Peter and Max for kicking off this discussion, +1 for
> > > > this proposal.
> > > >
> > > > Dynamic Sink is a key improvement for the Iceberg Sink, and it
> > > > can be well integrated with Flink CDC's pipeline sink in the
> > > > future as well.
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > > > On November 13, 2024 at 7:57 PM, Maximilian Michels
> > > > > <m...@apache.org> wrote:
> > > > >
> > > > > Thanks for driving this, Peter! The Dynamic Iceberg Sink is
> > > > > meant to solve several pain points of the current Flink
> > > > > Iceberg sink. Note that the Iceberg sink lives in Apache
> > > > > Iceberg, not Apache Flink, although that may change one day
> > > > > when the interfaces are stable enough.
> > > > >
> > > > > From the user perspective, there are three main advantages to
> > > > > using the Dynamic Iceberg sink:
> > > > >
> > > > > 1. It supports writing to any number of tables (no more 1:1
> > > > > sink/topic relationship).
> > > > > 2. It can dynamically create and update tables based on the
> > > > > user-supplied routing.
> > > > > 3. It can dynamically update the schema and partitioning of
> > > > > tables based on the user-supplied specification.
> > > > >
> > > > > I'd say these features will have a big impact on Flink Iceberg
> > > > > users.
> > > > >
> > > > > Cheers,
> > > > > Max
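For context on the 1:1 limitation described above: with the Iceberg Flink
sink available at the time of this thread, the target table is bound
statically when the job graph is built. A minimal sketch using the
existing public API (the warehouse path and the single-column payload are
placeholders, and the table is assumed to already exist):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.flink.sink.FlinkSink;

public class StaticIcebergSinkSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();

    // Placeholder source; in practice this would be Kafka, Flink CDC, etc.
    DataStream<RowData> events =
        env.fromElements((RowData) GenericRowData.of(StringData.fromString("hello")));

    // The table identifier, schema, and partition spec are all resolved
    // here, at job submission time. If any of them changes later, the
    // job has to be restarted; this is exactly the limitation the
    // Dynamic Iceberg Sink proposal removes.
    FlinkSink.forRowData(events)
        .tableLoader(TableLoader.fromHadoopTable("file:///tmp/warehouse/db/events"))
        .append();

    env.execute("static-iceberg-sink");
  }
}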
> > > > >
> > > > > On Wed, Nov 13, 2024 at 12:27 PM Péter Váry <
> > > > > peter.vary.apa...@gmail.com> wrote:
> > > > >
> > > > > > Hi Team,
> > > > > >
> > > > > > With Max Michels, we started to work on enhancing the
> > > > > > current Iceberg Sink to allow inserting evolving records
> > > > > > into a changing table.
> > > > > >
> > > > > > Created the design doc [1]
> > > > > > Created the Iceberg proposal [2]
> > > > > > Started the conversation on the Iceberg dev list [3]
> > > > > >
> > > > > > From the abstract:
> > > > > > ---------
> > > > > > The Flink Iceberg connector sink is the tool to write data
> > > > > > to an Iceberg table from a continuous Flink stream. The
> > > > > > current sink implementations emphasize throughput over
> > > > > > flexibility. The main limiting factor is that the Iceberg
> > > > > > Flink Sink requires a static table structure: the table, the
> > > > > > schema, and the partitioning specification need to be
> > > > > > constant, and if any of them changes, the Flink job needs to
> > > > > > be restarted. This allows optimal record serialization and
> > > > > > good performance, but real-life use cases need to work
> > > > > > around this limitation when the underlying table has
> > > > > > changed. We need to provide a tool to accommodate these
> > > > > > changes. [..]
> > > > > > The following typical use cases are considered during this
> > > > > > design:
> > > > > > - The schema of incoming Avro records changes (new columns
> > > > > > are added, or other backward-compatible changes happen). The
> > > > > > Flink job is expected to update the table schema dynamically
> > > > > > and continue to ingest data with the new and the old schema
> > > > > > without a job restart.
> > > > > > - Incoming records define the target Iceberg table
> > > > > > dynamically. The Flink job is expected to create the new
> > > > > > table(s) and continue writing to them without a job restart.
> > > > > > - The partitioning schema of the table changes. The Flink
> > > > > > job is expected to update the specification and continue
> > > > > > writing to the target table without a job restart.
> > > > > > ---------
> > > > > > If you have any questions, ideas, or suggestions, please let
> > > > > > us know on any of the email threads, or in comments on the
> > > > > > document.
> > > > > >
> > > > > > Thanks,
> > > > > > Peter
> > > > > >
> > > > > > [1] - https://docs.google.com/document/d/1R3NZmi65S4lwnmNjH4gLCuXZbgvZV5GNrQKJ5NYdO9s
> > > > > > [2] - https://github.com/orgs/apache/projects/429
> > > > > > [3] - https://lists.apache.org/thread/0sbosd8rzbl5rhqs9dtt99xr44kld3qp
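To make the second use case concrete: with the current API, the usual
workaround for writing to several tables is to enumerate them up front
and attach one static sink per table. An illustrative sketch of that
workaround (the RoutedRow wrapper, table names, and warehouse paths are
all placeholders invented for this example; the tables are assumed to
exist):

import java.io.Serializable;
import java.util.List;

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.data.StringData;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.flink.sink.FlinkSink;

public class PerTableFanOutSketch {

  // Illustrative wrapper: a payload plus the name of its target table.
  static class RoutedRow implements Serializable {
    final String table;
    final RowData row;
    RoutedRow(String table, RowData row) {
      this.table = table;
      this.row = row;
    }
  }

  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env =
        StreamExecutionEnvironment.getExecutionEnvironment();

    // Placeholder source; in practice the target table would come from
    // the payload, e.g. a Flink CDC change event.
    DataStream<RoutedRow> events = env.fromElements(
        new RoutedRow("db/orders", GenericRowData.of(StringData.fromString("o1"))),
        new RoutedRow("db/customers", GenericRowData.of(StringData.fromString("c1"))));

    // The full set of target tables must be enumerated when the job
    // graph is built: one filter plus one statically configured sink per
    // table. A record routed to any other table cannot be written
    // without restarting the job with an updated topology.
    for (String table : List.of("db/orders", "db/customers")) {
      DataStream<RowData> forTable = events
          .filter(r -> r.table.equals(table))
          .map(r -> r.row)
          .returns(RowData.class);
      FlinkSink.forRowData(forTable)
          .tableLoader(TableLoader.fromHadoopTable("file:///tmp/warehouse/" + table))
          .append();
    }

    env.execute("per-table-static-sinks");
  }
}

The Dynamic Iceberg Sink proposed in this thread is meant to replace
this fixed fan-out with per-record routing, so new tables, schema
changes, and partition spec changes no longer require a restart.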