Hi there,

I'm developing a new version of the fully open-source Delta-Flink Sink, built on the Unified Flink Sink V2 APIs and the open-source Delta-Kernel library (see <https://delta.io/blog/delta-kernel/> for more details about Delta-Kernel).
I have a few questions about what Flink's exactly-once guarantees mean for the committer. For the following questions, you can assume I'm forcing a single global committer by using the `org.apache.flink.streaming.api.connector.sink2.SupportsPreCommitTopology::addPreCommitTopology` API and mapping the incoming Committable DataStream to `.global()`.

On to my questions:

1. For a given checkpointId, will the `org.apache.flink.api.connector.sink2.Committer::commit` API *always* be called with *all* committables for that checkpointId? Is there any chance that only *some* of the committables for that checkpointId are delivered, perhaps due to a network delay, an RPC delay, or even a lost end-of-interval RPC call?

2. If so, will the `org.apache.flink.api.connector.sink2.Committer::commit` API only ever be called with committables that all belong to the *same* checkpointId? Or could they belong to multiple checkpointIds?

3. Suppose that my SinkWriters have written and checkpointed their committables, and `Committer::commit` is now attempting to persist them into external state (i.e. the _delta_log for Delta Lake). During this time, it may be desirable to force a fresh rewrite of the data referenced by the committables. However, if we fail the Committer, it will just be retried with the *same* committables due to Flink's exactly-once guarantees and checkpointing mechanisms. Is it possible to somehow request that the writers rewrite the data from the previous checkpointId?

Thanks so much for the help! Very excited to contribute another Apache Flink connector!

Cheers!

--
*Scott Sandre*
*Sr. Software Engineer*
*Delta Ecosystem Team*
*scott.san...@databricks.com*