The way I imagine this, the sink would have its "own checkpoints", separate from the rest of the system and taken at a much shorter interval, and it would write to Kafka (with "transactional cooperation", as Stephan mentioned) while taking these checkpoints. Then, when a replay happens from a global system checkpoint, the sink can look at its own checkpoints to decide, for each tuple, whether to send it or not.
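To make the idea concrete, here is a rough sketch in Python. It is not Flink or Kafka API. `FakeTransactionalLog` and `MiniCheckpointSink` are hypothetical names, and the "transactional cooperation" is modeled simply as the records and the sink's position being committed together in one step. The sketch also assumes that a replay delivers the same tuples in the same order as before the failure, so a plain counter is enough to recognize what was already written.

```python
class FakeTransactionalLog:
    """Hypothetical stand-in for Kafka with transactional cooperation:
    a batch of records and the sink's position are committed together."""

    def __init__(self):
        self.records = []
        # Number of tuples since the last global checkpoint that are durably written.
        self.committed_position = 0

    def commit(self, batch, new_position):
        # Atomic in this sketch: either both change or neither does.
        self.records.extend(batch)
        self.committed_position = new_position


class MiniCheckpointSink:
    """Sink that takes its own small checkpoints every `interval` tuples."""

    def __init__(self, log, interval=2):
        self.log = log
        self.interval = interval
        self.position = 0   # tuples seen since the last global checkpoint
        self.pending = []   # tuples buffered since the sink's last own checkpoint

    def invoke(self, value):
        self.position += 1
        if self.position <= self.log.committed_position:
            # Already written before the failure, so do not send it again.
            return
        self.pending.append(value)
        if len(self.pending) >= self.interval:
            self.flush()

    def flush(self):
        # The sink's "own checkpoint": write the buffered tuples and the position.
        if self.pending:
            self.log.commit(self.pending, self.position)
            self.pending = []

    def on_global_checkpoint(self):
        # A global checkpoint starts a new counting epoch.
        self.flush()
        self.position = 0
        self.log.commit([], 0)

    def restore_from_global_checkpoint(self):
        # In-memory state is lost; the committed position survives in the log.
        self.position = 0
        self.pending = []


log = FakeTransactionalLog()
sink = MiniCheckpointSink(log, interval=2)
for value in ["a", "b", "c"]:
    sink.invoke(value)              # "a" and "b" are committed, "c" is only buffered

sink.restore_from_global_checkpoint()   # simulated failure and recovery
for value in ["a", "b", "c", "d"]:
    sink.invoke(value)              # replay: "a" and "b" are skipped

print(log.records)
```

In this sketch the replayed "a" and "b" are dropped because the sink's position is still within the committed range, while "c" (buffered but never committed before the failure) is written exactly once.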
@Stephan:

> That assumes deterministic streams and to some extent deterministic tuple
> order.
> That may be given sometimes, but it is a very strong assumption in many cases.

Ah yes, you are right. But basing everything on event time points in the direction of deterministic streams, right?