Re: Checkpoint for exact-once proccessing

2016-01-13 Thread Don Frascuchon
Thanks to both !!. That's help me to understand the recovery process El mié., 13 ene. 2016 a las 14:01, Stephan Ewen () escribió: > Thanks, Gordon, for the nice answer! > > One thing is important to add: Exactly-once refers to state maintained by > Flink. All side effects (changes made to the "o

Re: Checkpoint for exact-once proccessing

2016-01-13 Thread Stephan Ewen
Thanks, Gordon, for the nice answer! One thing is important to add: Exactly-once refers to state maintained by Flink. All side effects (changes made to the "outside" world), which includes sinks, need in fact to be idempotent, or will only have "at-least once" semantics. In practice, this works o

Re: Checkpoint for exact-once proccessing

2016-01-13 Thread Tzu-Li (Gordon) Tai
Hi Francis, A part of every complete snapshot is the record positions associated with the barrier that triggered the checkpointing of this snapshot. The snapshot is completed only when all the records within the checkpoint reaches the sink. When a topology fails, all the operators' state will fall

Re: Checkpoint for exact-once proccessing

2016-01-13 Thread Don Frascuchon
Hi Stephan, Thanks for your quickly response. So, consider an operator task with two processed records and no barrier incoming. If the task fail and must be records, the last consistent snapshot will be used, which no includes information about the processed but no checkpointed records. What abo

Re: Checkpoint for exact-once proccessing

2016-01-13 Thread Stephan Ewen
Hi! I think there is a misunderstanding. There are no identifiers maintained and no individual records deleted. On recovery, all operators reset their state to a consistent snapshot: https://ci.apache.org/projects/flink/flink-docs-release-0.10/internals/stream_checkpointing.html Greetings, Step