Re: Flink: uncommitted data files and garbage collection

2024-02-29 Thread Steven Wu
> Maybe the Iceberg sink job should exit terminally if it hasn't been able to commit to Iceberg after a threshold (like 24 hours), e.g. due to a catalog service outage. This will prevent Flink jobs from producing more data files that can't be committed.

Actually, this doesn't help. old uncommitted d

Flink: uncommitted data files and garbage collection

2024-02-29 Thread Steven Wu
We are probably off the topic of the original thread. I am moving the Flink part of the discussion to a new thread/subject.

> but the prepared and not yet committed data files are also present in their final place. These data files are also not part of the table yet, and could be removed by the or