Hi, I'm working on a custom implementation of a sink which I would like to use with exactly once semantics. Therefore I have implemented the TwoPhaseCommitSinkFunction class as mentioned in this recent post: https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html
I have some integration tests which run jobs using the custom sink with a finite dataset (A RichSourceFunction with a "finite" run method). The tests fail because of missing data. I noticed that is due to the last transaction being aborted. When looking into the source code that makes sense because the close() implementation of TwoPhaseCommitSinkFunction calls abort on the current transaction: https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/sink/TwoPhaseCommitSinkFunction.java I could override this behaviour and perform a commit, but then I would perform a commit without getting the checkpoint completed notification, thus not properly maintaining exactly once guarantees Is (and how is) it possible to have end-to-end exactly once guarantees when dealing with (sometimes) finite jobs? Thanks! Niels