Iceberg table maintenance

Péter Váry Thu, 28 Mar 2024 11:00:52 -0700

Hi Team,

I am working on adding the ability for the Flink Iceberg connector to run
maintenance tasks on Iceberg tables. This will fix the small-files issue
and, in the long run, help compact the high number of positional and
equality deletes created by Flink tasks writing CDC data to Iceberg
tables, without requiring Spark in the infrastructure.


I have done some planning and prototyping, and I am currently trying out
the solution on a larger scale.

I put together a document describing my current solution:
https://docs.google.com/document/d/16g3vR18mVBy8jbFaLjf2JwAANuYOmIwr15yDDxovdnA/edit?usp=sharing

I would love to hear your thoughts and feedback on this so that we can
arrive at a good final solution.

Thanks,
Peter
