Thanks for the proposal! I will take a look ASAP. Regards, JB
On Mon, Nov 11, 2024 at 6:32 PM Péter Váry <peter.vary.apa...@gmail.com> wrote:
>
> Hi Team,
>
> Together with Max Michels, we have started working on enhancing the current Iceberg Sink to
> allow inserting evolving records into a changing table.
> See:
> https://docs.google.com/document/d/1R3NZmi65S4lwnmNjH4gLCuXZbgvZV5GNrQKJ5NYdO9s
> We created a project to track the lifecycle of the proposal:
> https://github.com/orgs/apache/projects/429
>
> From the abstract:
> ---------
> The Flink Iceberg connector sink is the tool for writing data to an Iceberg table
> from a continuous Flink stream. The current Sink implementations emphasize
> throughput over flexibility. The main limiting factor is that the Iceberg
> Flink Sink requires a static table structure: the table, the schema, and the
> partitioning specification all need to be constant. If any of these changes,
> the Flink job needs to be restarted. This allows optimal record serialization
> and good performance, but real-life use cases need to work around this
> limitation when the underlying table changes. We need to provide a tool that
> accommodates these changes.
> [..]
> The following typical use cases are considered in this design:
> - The schema of the incoming Avro records changes (new columns are added, or other
> backward-compatible changes happen). The Flink job is expected to update the
> table schema dynamically and continue ingesting data with both the new and the
> old schema without a job restart.
> - Incoming records define the target Iceberg table dynamically. The Flink job
> is expected to create the new table(s) and continue writing to them without a
> job restart.
> - The partitioning specification of the table changes. The Flink job is expected to
> update the specification and continue writing to the target table without a
> job restart.
> ---------
>
> If you have any questions, ideas, or suggestions, please let us know here, or in
> comments on the document.
>
> Thanks,
> Peter
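
For readers less familiar with the limitation the proposal addresses, below is a minimal sketch of how the existing (static) Flink Iceberg sink is wired up today. The target table, and therefore its schema and partition spec, is fixed when the job graph is built, which is exactly why a table change currently forces a job restart. The warehouse path, job name, and the buildSource helper are illustrative placeholders, not part of the proposal; the new dynamic sink API is defined in the linked design document, not here.

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.data.RowData;
import org.apache.iceberg.flink.TableLoader;
import org.apache.iceberg.flink.sink.FlinkSink;

public class StaticIcebergSinkExample {

  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // Source of RowData records; their layout must match the target table's current schema.
    DataStream<RowData> rows = buildSource(env);

    // The target table is resolved once, when the job is built. A schema or
    // partition-spec change in this table requires rebuilding and restarting the job,
    // which is the restriction the dynamic sink proposal aims to remove.
    TableLoader tableLoader = TableLoader.fromHadoopTable("hdfs://warehouse/db/events");

    FlinkSink.forRowData(rows)
        .tableLoader(tableLoader)
        .append();

    env.execute("static-iceberg-sink");
  }

  // Placeholder: constructing a real RowData source is out of scope for this sketch.
  private static DataStream<RowData> buildSource(StreamExecutionEnvironment env) {
    throw new UnsupportedOperationException("provide a real RowData source");
  }
}
```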