Scope
=====

We recently discovered an issue [1] affecting users of the Flink V2
Iceberg sink [2]. It originates in the Flink runtime and is present in
the following Flink patch releases:

- Flink 1.19.3
- Flink 1.20.2

Notably, any releases prior to these patch releases are _not_
affected. For example, Flink 1.19.2 and Flink 1.20.1 are _not_
affected.

SQL users are _not_ affected by default; they are affected only if
they opted in to the V2 sink [2]. Users who opted in, or who use the
affected class `IcebergSink` [3] directly, are affected. Users of the
old `FlinkSink` are _not_ affected.

The issue exists in batch pipelines only. Batch pipelines are Flink
jobs which either use Flink's batch execution mode or Flink's
streaming mode with checkpointing disabled. Stream processing users
who have checkpointing enabled are _not_ affected.
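
The scope conditions above can be summarized as a small predicate. The
following is only a sketch in plain Java: the class
`IcebergSinkScopeCheck`, its methods, and the `ExecutionMode` enum are
hypothetical names introduced for illustration and are not part of
Flink or Iceberg. It assumes you already know your Flink version,
whether the job uses the V2 `IcebergSink`, and how the job is executed.

```java
import java.util.Set;

// Hypothetical helper, not part of Flink or Iceberg. It only encodes
// the scope rules stated in this notice.
final class IcebergSinkScopeCheck {

    // Flink patch releases named in this notice as affected.
    private static final Set<String> AFFECTED_FLINK_VERSIONS =
            Set.of("1.19.3", "1.20.2");

    // Illustrative stand-in for the job's execution mode.
    enum ExecutionMode { BATCH, STREAMING }

    static boolean isAffectedFlinkVersion(String flinkVersion) {
        return AFFECTED_FLINK_VERSIONS.contains(flinkVersion);
    }

    // "Batch pipeline" as defined above: batch execution mode, or
    // streaming mode with checkpointing disabled.
    static boolean isBatchPipeline(ExecutionMode mode, boolean checkpointingEnabled) {
        return mode == ExecutionMode.BATCH || !checkpointingEnabled;
    }

    // usesIcebergSinkV2 is true for jobs that use IcebergSink directly
    // or SQL jobs that opted in to the V2 sink; false for FlinkSink.
    static boolean isAffected(String flinkVersion,
                              boolean usesIcebergSinkV2,
                              ExecutionMode mode,
                              boolean checkpointingEnabled) {
        return isAffectedFlinkVersion(flinkVersion)
                && usesIcebergSinkV2
                && isBatchPipeline(mode, checkpointingEnabled);
    }

    public static void main(String[] args) {
        // Batch job on an affected release with the V2 sink.
        System.out.println(isAffected("1.20.2", true, ExecutionMode.BATCH, false));
        // Same job on an earlier patch release.
        System.out.println(isAffected("1.20.1", true, ExecutionMode.BATCH, false));
        // Streaming job with checkpointing enabled.
        System.out.println(isAffected("1.19.3", true, ExecutionMode.STREAMING, true));
    }
}
```

All three conditions must hold at once, so changing any one of them
(for example, staying on Flink 1.20.1) takes a job out of scope.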

Symptom
=======

In batch pipelines with the above scope, writes to Iceberg tables will
fail to produce a final snapshot when the batch job terminates. This
means that the resulting Iceberg table will not contain the data
previously written to data files. More details can be found in the
associated Flink issue [4].

We've addressed the issue on the Flink development branches for future
releases. The upcoming patch releases Flink 1.19.4 and Flink 1.20.3
will contain the fix.

Cheers,
Max

[1] https://github.com/apache/iceberg/pull/13714
[2] https://github.com/apache/iceberg/blob/8d88846ec46bfca60d73d239e8bb34a740f8f0ce/docs/docs/flink-writes.md?plain=1#L375
[3] https://github.com/apache/iceberg/blob/8d88846ec46bfca60d73d239e8bb34a740f8f0ce/flink/v1.20/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergSink.java#L137
[4] https://issues.apache.org/jira/browse/FLINK-38370
