Re: How does at least once checkpointing work

Yuan Mei Tue, 12 Jan 2021 01:57:09 -0800

>
>
> It sounds like any state which does not have some form of uniqueness could
> end up being incorrect.
>
At-least-once usually works if the use case can tolerate a certain level
of duplication or the computation is idempotent.



> Specifically in my case, all rows passing through the execution graph have
> unique ids. However, any operator from groupby foreign_key then sum/count
> could end up with an inconsistent count. Normally a retract (-1) and then
> insert (+1) would keep the count correct, but with "at least once" a
> retract (-1) may be from epoch n+1 and therefore played twice, making the
> count lower than it should be.
>
>
Not completely sure how the "retract (-1)" and "insert (+1)" work in your
case, but any input record that changes state (a count or sum) can be
played twice after a recovery.
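
To make the replay concrete, here is a toy simulation in plain Python. It is
not Flink code: the function names, the two channels "A" and "B", and the
assumption that a crash happens after all shown input was processed are all
made up for illustration. It models one task with two inputs, takes a
snapshot once barriers from both inputs were seen, then restores the snapshot
and replays every record that followed the barrier on each input.

```python
BARRIER = "BARRIER"


def snapshot_count(channels, arrival_order, aligned):
    """Return the count stored in the snapshot of a two-input toy task.

    channels: dict of channel name -> list of int deltas and one BARRIER.
    arrival_order: channel names, one per delivery attempt.
    aligned: True models exactly-once (a channel is blocked after its
             barrier); False models at-least-once (it keeps being processed).
    """
    pos = {name: 0 for name in channels}
    seen_barrier = set()
    count = 0
    for name in arrival_order:
        if aligned and name in seen_barrier:
            # Alignment: leave the record unconsumed until the snapshot.
            continue
        item = channels[name][pos[name]]
        pos[name] += 1
        if item == BARRIER:
            seen_barrier.add(name)
            if len(seen_barrier) == len(channels):
                return count  # snapshot is taken here
        else:
            count += item
    raise ValueError("checkpoint never completed")


def recover_and_finish(channels, snapshot):
    """Restore the snapshot, then replay everything after each barrier."""
    count = snapshot
    for items in channels.values():
        count += sum(items[items.index(BARRIER) + 1:])
    return count


# Hypothetical input: the retract (-1) on A arrives after A's barrier,
# but before B's barrier.
channels = {"A": [+1, BARRIER, -1], "B": [+1, +1, BARRIER]}
order = ["A", "A", "A", "B", "B", "B"]
true_total = sum(x for items in channels.values() for x in items if x != BARRIER)

at_least_once = recover_and_finish(channels, snapshot_count(channels, order, aligned=False))
exactly_once = recover_and_finish(channels, snapshot_count(channels, order, aligned=True))
print(true_total, at_least_once, exactly_once)
```

With this input the true total is 2. The unaligned run ends at 1 because the
-1 is already inside the snapshot and is applied again on replay, while the
aligned run ends at 2.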


> Am I understanding this correctly?
>
> Thanks!
>
> On Mon, Jan 11, 2021 at 10:06 PM Yuan Mei <yuanmei.w...@gmail.com> wrote:
>
>> Hey Rex,
>>
>> You will probably find the link below helpful; it explains how
>> at-least-once (no alignment) differs from exactly-once (needs
>> alignment). It also explains how the alignment phase is skipped in
>> at-least-once mode.
>>
>>
>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/concepts/stateful-stream-processing.html#exactly-once-vs-at-least-once
>>
>> At a high level, in at-least-once mode a task with multiple input
>> channels
>> 1. does NOT block processing to wait for barriers from all inputs,
>> meaning the task keeps processing data from a channel after receiving
>> that channel's barrier.
>> 2. still takes its snapshot only after seeing the checkpoint barrier
>> from all input channels.
>>
>> In this way, Snapshot N may contain state changes coming from Epoch N+1.
>> Those records are replayed after a recovery, which is where "at least
>> once" comes from.
>>
>> On Tue, Jan 12, 2021 at 1:03 PM Rex Fenley <r...@remind101.com> wrote:
>>
>>> Hello,
>>>
>>> We're using the TableAPI and want to optimize for checkpoint alignment
>>> times. We received some advice to possibly use at-least-once. I'd like to
>>> understand how checkpointing works in at-least-once mode so I understand
>>> the caveats and can evaluate whether or not that will work for us.
>>>
>>> Thanks!
>>> --
>>>
>>> Rex Fenley  |  Software Engineer - Mobile and Backend
>>>
>>>
>>> Remind.com <https://www.remind.com/> |  BLOG <http://blog.remind.com/>
>>>  |  FOLLOW US <https://twitter.com/remindhq>  |  LIKE US
>>> <https://www.facebook.com/remindhq>
>>>
>>
>
> --
>
> Rex Fenley  |  Software Engineer - Mobile and Backend
>
>
> Remind.com <https://www.remind.com/> |  BLOG <http://blog.remind.com/>  |
>  FOLLOW US <https://twitter.com/remindhq>  |  LIKE US
> <https://www.facebook.com/remindhq>
>
