Hi, 

I would like to start a discussion about contributing Iceberg Flink Connector 
to Flink. 

I created a doc 
<https://docs.google.com/document/d/1WC8xkPiVdwtsKL2VSPAUgzm9EjrPs8ZRjEtcwv93ISI/edit?usp=sharing>
 with all the details, following the Flink Connector template, as I don’t have 
permissions to create a FLIP yet.
High-level details are captured below:

Motivation:

This FLIP aims to contribute the existing Apache Iceberg Flink Connector to 
Flink. 

Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds 
tables to compute engines including Spark, Trino, PrestoDB, Flink, Hive and 
Impala using a high-performance table format that works just like a SQL table. 
Iceberg avoids unpleasant surprises. Schema evolution works and won’t 
inadvertently un-delete data. Users don’t need to know about partitioning to 
get fast queries. Iceberg was designed to solve correctness problems in 
eventually-consistent cloud object stores. 

Iceberg supports both Flink’s DataStream API and Table API. Following the 
Flink community’s guideline, the Iceberg Flink connector actively maintains 
only the latest two Flink minor versions. See Multi-Engine Support (the Apache 
Flink section) in the Iceberg docs for further details.


The Iceberg connector supports:

        • Source: detailed Source design <https://docs.google.com/document/d/1q6xaBxUPFwYsW9aXWxYUh7die6O7rDeAPFQcTAMQ0GM/edit#>, based on FLIP-27
        • Sink: detailed Sink design and the interfaces used <https://docs.google.com/document/d/1O-dPaFct59wUWQECXEEYIkl9_MOoG3zTbC2V-fZRwrg/edit#>
        • Usable in both the DataStream and Table API/SQL
        • DataStream read/append/overwrite
        • SQL create/alter/drop table, select, insert into, insert overwrite
        • Streaming or batch read in the Java API
        • Support for Flink’s Python API

See the Iceberg Flink documentation 
<https://iceberg.apache.org/docs/latest/flink/#flink> for detailed usage 
instructions.
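
For a quick illustration, here is a minimal sketch of a streaming read through 
the DataStream API, based on the usage shown in the Iceberg Flink docs (the 
Hadoop warehouse path is just a placeholder):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.data.RowData;
    import org.apache.iceberg.flink.TableLoader;
    import org.apache.iceberg.flink.source.FlinkSource;

    public class IcebergStreamingReadExample {
      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder location; point this at a real Iceberg table.
        TableLoader tableLoader = TableLoader.fromHadoopTable("hdfs://nn:8020/warehouse/path");

        // Incrementally read new snapshots of the Iceberg table as a stream of RowData.
        DataStream<RowData> stream = FlinkSource.forRowData()
            .env(env)
            .tableLoader(tableLoader)
            .streaming(true)
            .build();

        stream.print();
        env.execute("Iceberg streaming read");
      }
    }

Batch reads use the same builder with streaming(false), and writes go through 
FlinkSink in a similar style; the doc linked above covers the full set of 
interfaces.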

Looking forward to the discussion!

Thanks
Abid
