Thanks, Gordon, for the nice answer!

One thing is important to add: exactly-once refers to the state maintained by
Flink. All side effects (changes made to the "outside" world), which
include sinks, in fact need to be idempotent, or they will only have
"at-least-once" semantics.

In practice, this often works very well, because results can be computed
with "exactly-once" semantics in Flink and are then sent "one or more
times" to the outside world (for example, a database). If that database
simply overwrites/replaces values for keys (think of an upsert operation), this
gives end-to-end exactly-once semantics.
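
To make this concrete, here is a toy simulation in plain Python. It is not
Flink code and uses no Flink API; run_job, both_sinks, append_log and
upsert_table are hypothetical names made up for this sketch. It assumes a
replayable source, a deterministic keyed running sum, and a single failure
that rolls the state and the source offset back to the last checkpoint.

```python
import copy


def run_job(records, checkpoint_every, fail_at, sink):
    """Simulate a keyed running sum with checkpoints and one failure.

    records: list of (key, value) pairs, replayable by offset.
    fail_at: offset at which the job crashes once (None for no failure).
    sink: callable(key, total) that receives every emitted result.
    """
    state = {}            # operator state: key -> running sum
    checkpoint = (0, {})  # (source offset, snapshot of the state)
    offset = 0
    failed = False
    while offset < len(records):
        if offset == fail_at and not failed:
            # Crash: restore state and source position from the checkpoint.
            failed = True
            offset, state = checkpoint[0], copy.deepcopy(checkpoint[1])
            continue
        key, value = records[offset]
        state[key] = state.get(key, 0) + value
        sink(key, state[key])  # side effect, repeated for replayed records
        offset += 1
        if offset % checkpoint_every == 0:
            checkpoint = (offset, copy.deepcopy(state))
    return state


records = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]
append_log = []    # non-idempotent sink: every write is kept
upsert_table = {}  # idempotent sink: a write replaces the value for its key


def both_sinks(key, total):
    append_log.append((key, total))
    upsert_table[key] = total


final_state = run_job(records, checkpoint_every=2, fail_at=3, sink=both_sinks)

print("state:       ", final_state)
print("upsert sink: ", upsert_table)
print("append sink: ", append_log)
```

In this run the record at offset 2 is processed twice, so the append-style
sink receives its result twice (at-least-once), while the internal state and
the upsert table end up as if every record had been applied exactly once.
That only holds here because the replayed computation is deterministic and
produces the same value for the same key.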

Greetings,
Stephan


On Wed, Jan 13, 2016 at 1:31 PM, Tzu-Li (Gordon) Tai <tzuli...@gmail.com>
wrote:

> Hi Francis,
>
> A part of every complete snapshot is the record positions associated with
> the barrier that triggered the checkpointing of this snapshot. The snapshot
> is completed only when all the records within the checkpoint reach the
> sink. When a topology fails, all the operators' state will fall back to the
> latest complete snapshot (incomplete snapshots will be ignored). The data
> source will also fall back to the position recorded with this snapshot, so
> even if some data records are read again after the restore, the
> restored operators' state is also clean of those records' effects. This way,
> Flink guarantees exactly-once effects of each record on every operator's
> state. The user functions in operators need not be implemented
> idempotently.
>
> Hope this helps answer your question!
>
> Cheers,
> Gordon
>
>
>
> --
> View this message in context:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Checkpoint-for-exact-once-proccessing-tp4261p4264.html
> Sent from the Apache Flink User Mailing List archive at Nabble.com.
>
