Re: How does at least once checkpointing work

Rex Fenley Tue, 12 Jan 2021 09:32:57 -0800

Thanks!

On Tue, Jan 12, 2021 at 1:56 AM Yuan Mei <yuanmei.w...@gmail.com> wrote:


>
>> It sounds like any state which does not have some form of uniqueness
>> could end up being incorrect.
>>
>> at least once usually works if the use case can tolerate a certain level
> of duplication or the computation is idempotent.
>
>
>> Specifically in my case, all rows passing through the execution graph
>> have unique ids. However, any operator from groupby foreign_key then
>> sum/count could end up with an inconsistent count. Normally a retract (-1)
>> and then insert (+1) would keep the count correct, but with "at least once"
>> a retract (-1) may be from epoch n+1 and therefore played twice, making the
>> count equal less than it should actually be.
>>
>>
> Not completely sure how the "retract (-1)" and "insert (+1)" work in your
> case, but "input data" that leads to a state change (count/sum change) is
> possible to be played twice after a recovery.
>
>
>> Am I understanding this correctly?
>>
>> Thanks!
>>
>> On Mon, Jan 11, 2021 at 10:06 PM Yuan Mei <yuanmei.w...@gmail.com> wrote:
>>
>>> Hey Rex,
>>>
>>> You probably will find the link below helpful; it explains how
>>> at-least-once (does not have alignment) is different
>>> from exactly-once(needs alignment). It also explains how the
>>> alignment phase is skipped in the at-least-once mode.
>>>
>>>
>>> https://ci.apache.org/projects/flink/flink-docs-release-1.12/concepts/stateful-stream-processing.html#exactly-once-vs-at-least-once
>>>
>>> In a high level, at least once mode for a task with multiple input
>>> channels
>>> 1. does NOT block processing to wait for barriers from all inputs,
>>> meaning the task keeps processing data after receiving a barrier even if it
>>> has multiple inputs.
>>> 2. but still, a task takes a snapshot after seeing the checkpoint
>>> barrier from all input channels.
>>>
>>> In this way, a Snapshot N may contain data change coming from Epoch N+1;
>>> that's where "at least once" comes from.
>>>
>>> On Tue, Jan 12, 2021 at 1:03 PM Rex Fenley <r...@remind101.com> wrote:
>>>
>>>> Hello,
>>>>
>>>> We're using the TableAPI and want to optimize for checkpoint alignment
>>>> times. We received some advice to possibly use at-least-once. I'd like to
>>>> understand how checkpointing works in at-least-once mode so I understand
>>>> the caveats and can evaluate whether or not that will work for us.
>>>>
>>>> Thanks!
>>>> --
>>>>
>>>> Rex Fenley  |  Software Engineer - Mobile and Backend
>>>>
>>>>
>>>> Remind.com <https://www.remind.com/> |  BLOG <http://blog.remind.com/>
>>>>  |  FOLLOW US <https://twitter.com/remindhq>  |  LIKE US
>>>> <https://www.facebook.com/remindhq>
>>>>
>>>
>>
>> --
>>
>> Rex Fenley  |  Software Engineer - Mobile and Backend
>>
>>
>> Remind.com <https://www.remind.com/> |  BLOG <http://blog.remind.com/>
>>  |  FOLLOW US <https://twitter.com/remindhq>  |  LIKE US
>> <https://www.facebook.com/remindhq>
>>
>

-- 

Rex Fenley  |  Software Engineer - Mobile and Backend


Remind.com <https://www.remind.com/> |  BLOG <http://blog.remind.com/>
 |  FOLLOW
US <https://twitter.com/remindhq>  |  LIKE US
<https://www.facebook.com/remindhq>

Re: How does at least once checkpointing work

Reply via email to