> It sounds like any state which does not have some form of uniqueness could
> end up being incorrect.

At-least-once usually works if the use case can tolerate a certain level of
duplication or the computation is idempotent.
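
To make the idempotency point concrete, here is a toy Python sketch (not Flink code). The event layout with `id`, `fk` and `delta` fields and both helper functions are made up for illustration. It delivers one retract twice, as could happen after a recovery, and compares a plain counter with state that tracks the set of live row ids per key:

```python
from collections import defaultdict

def apply_counter(state, event):
    """Non-idempotent update: every delivery of the event changes the count."""
    state[event["fk"]] += event["delta"]

def apply_id_set(state, event):
    """Idempotent update: state is the set of live row ids per key,
    so delivering the same insert or retract again is a no-op."""
    if event["delta"] > 0:
        state[event["fk"]].add(event["id"])
    else:
        state[event["fk"]].discard(event["id"])

# Hypothetical changelog: insert rows 1 and 2 under key "a", then retract row 2.
events = [
    {"id": 1, "fk": "a", "delta": +1},
    {"id": 2, "fk": "a", "delta": +1},
    {"id": 2, "fk": "a", "delta": -1},
]
# Simulate at-least-once delivery: the retract arrives a second time.
delivered = events + [events[-1]]

counter = defaultdict(int)
id_sets = defaultdict(set)
for e in delivered:
    apply_counter(counter, e)
    apply_id_set(id_sets, e)

print(counter["a"])       # 0 (one live row remains, so the correct count is 1)
print(len(id_sets["a"]))  # 1
```

The id-set variant trades extra state for tolerance of duplicates; whether that trade is acceptable depends on the job.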

> Specifically in my case, all rows passing through the execution graph have
> unique ids. However, any operator from groupby foreign_key then sum/count
> could end up with an inconsistent count. Normally a retract (-1) and then
> insert (+1) would keep the count correct, but with "at least once" a
> retract (-1) may be from epoch n+1 and therefore played twice, making the
> count lower than it should actually be.

I'm not completely sure how the "retract (-1)" and "insert (+1)" work in your
case, but any input data that leads to a state change (a count/sum change)
can be played twice after a recovery.

> Am I understanding this correctly?
>
> Thanks!
>
> On Mon, Jan 11, 2021 at 10:06 PM Yuan Mei <yuanmei.w...@gmail.com> wrote:
>
>> Hey Rex,
>>
>> You probably will find the link below helpful; it explains how
>> at-least-once (does not have alignment) is different
>> from exactly-once (needs alignment). It also explains how the
>> alignment phase is skipped in the at-least-once mode.
>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/concepts/stateful-stream-processing.html#exactly-once-vs-at-least-once
>>
>> At a high level, at-least-once mode for a task with multiple input
>> channels
>> 1. does NOT block processing to wait for barriers from all inputs,
>> meaning the task keeps processing data after receiving a barrier even if it
>> has multiple inputs.
>> 2. but still, a task takes a snapshot after seeing the checkpoint barrier
>> from all input channels.
>>
>> In this way, a Snapshot N may contain data change coming from Epoch N+1;
>> that's where "at least once" comes from.
>>
>> On Tue, Jan 12, 2021 at 1:03 PM Rex Fenley <r...@remind101.com> wrote:
>>
>>> Hello,
>>>
>>> We're using the TableAPI and want to optimize for checkpoint alignment
>>> times. We received some advice to possibly use at-least-once.
>>> I'd like to understand how checkpointing works in at-least-once mode so
>>> I understand the caveats and can evaluate whether or not that will work
>>> for us.
>>>
>>> Thanks!
>>> --
>>>
>>> Rex Fenley | Software Engineer - Mobile and Backend
>>>
>>> Remind.com <https://www.remind.com/> | BLOG <http://blog.remind.com/>
>>> | FOLLOW US <https://twitter.com/remindhq> | LIKE US
>>> <https://www.facebook.com/remindhq>
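
For anyone who wants to see the barrier behaviour from the quoted explanation in miniature, below is a simplified Python simulation. It is only a sketch under stated assumptions, not how Flink is implemented: a single task reads two channels round-robin, each channel is a list of +1/-1 deltas with one barrier marker, and `take_snapshot` and `recover` are hypothetical helpers. With alignment the task stops reading a channel once its barrier has arrived; without alignment it keeps reading, so the snapshot can absorb a retract from epoch n+1 that is then replayed on recovery.

```python
BARRIER = object()  # marker separating epoch n from epoch n+1 in a channel

def take_snapshot(channels, aligned):
    """Read channels round-robin until a barrier has arrived on every channel,
    then return the running count, i.e. the state captured in the snapshot."""
    count = 0
    positions = [0] * len(channels)
    barrier_seen = [False] * len(channels)
    while not all(barrier_seen):
        for i, channel in enumerate(channels):
            if aligned and barrier_seen[i]:
                continue  # exactly-once: block this channel until alignment
            if positions[i] >= len(channel):
                continue  # nothing left to read on this channel
            item = channel[positions[i]]
            positions[i] += 1
            if item is BARRIER:
                barrier_seen[i] = True
            else:
                count += item  # at-least-once may apply epoch n+1 data here
    return count

def recover(snapshot, channels):
    """Restore the snapshot and replay every record after each barrier."""
    count = snapshot
    for channel in channels:
        count += sum(channel[channel.index(BARRIER) + 1:])
    return count

channel_a = [+1, BARRIER, -1]      # the -1 retract belongs to epoch n+1
channel_b = [+1, +1, +1, BARRIER]  # the barrier arrives late on this channel
channels = [channel_a, channel_b]

true_count = sum(x for ch in channels for x in ch if x is not BARRIER)
exactly_once = recover(take_snapshot(channels, aligned=True), channels)
at_least_once = recover(take_snapshot(channels, aligned=False), channels)

print(true_count, exactly_once, at_least_once)  # 3 3 2
```

In the unaligned run the snapshot already contains the -1 from `channel_a`, and the replay applies it a second time, which is the "count lower than it should be" case raised earlier in the thread.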