Hi, 

I would like to start a discussion about contributing Iceberg Flink Connector 
to Flink. 

I created a doc 
<https://docs.google.com/document/d/1WC8xkPiVdwtsKL2VSPAUgzm9EjrPs8ZRjEtcwv93ISI/edit?usp=sharing>
 with all the details, following the Flink Connector template, as I don’t have 
permissions to create a FLIP yet.
High-level details are captured below:

Motivation:

This FLIP aims to contribute the existing Apache Iceberg Flink Connector to 
Flink. 

Apache Iceberg is an open table format for huge analytic datasets. Iceberg adds 
tables to compute engines including Spark, Trino, PrestoDB, Flink, Hive and 
Impala using a high-performance table format that works just like a SQL table. 
Iceberg avoids unpleasant surprises. Schema evolution works and won’t 
inadvertently un-delete data. Users don’t need to know about partitioning to 
get fast queries. Iceberg was designed to solve correctness problems in 
eventually-consistent cloud object stores. 

Iceberg supports both Flink’s DataStream API and Table API. Following the 
Flink community’s guideline, the Iceberg Flink connector actively maintains 
only the latest two Flink minor versions. See Multi-Engine Support (the Apache 
Flink section) in the Iceberg docs for further details.


The Iceberg connector supports:

        • Source: detailed Source design <https://docs.google.com/document/d/1q6xaBxUPFwYsW9aXWxYUh7die6O7rDeAPFQcTAMQ0GM/edit#>, based on FLIP-27
        • Sink: detailed Sink design and the interfaces used <https://docs.google.com/document/d/1O-dPaFct59wUWQECXEEYIkl9_MOoG3zTbC2V-fZRwrg/edit#>
        • Usable in both the DataStream and Table API/SQL
        • DataStream read/append/overwrite
        • SQL create/alter/drop table, select, insert into, insert overwrite
        • Streaming or batch read in the Java API
        • Support for Flink’s Python API

See the Iceberg Flink documentation 
<https://iceberg.apache.org/docs/latest/flink/#flink> for detailed usage 
instructions.
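
For a quick illustration, here is a minimal sketch of a streaming read through 
the DataStream API, based on the usage shown in the Iceberg Flink docs (the 
Hadoop warehouse path is just a placeholder):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.data.RowData;
    import org.apache.iceberg.flink.TableLoader;
    import org.apache.iceberg.flink.source.FlinkSource;

    public class IcebergStreamingReadExample {
      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder location; point this at a real Iceberg table.
        TableLoader tableLoader = TableLoader.fromHadoopTable("hdfs://nn:8020/warehouse/path");

        // Incrementally read new snapshots of the Iceberg table as a stream of RowData.
        DataStream<RowData> stream = FlinkSource.forRowData()
            .env(env)
            .tableLoader(tableLoader)
            .streaming(true)
            .build();

        stream.print();
        env.execute("Iceberg streaming read");
      }
    }

Batch reads use the same builder with streaming(false), and writes go through 
FlinkSink in a similar style; the doc linked above covers the full set of 
interfaces.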

Looking forward to the discussion!

Thanks
Abid
