Hi Raghavendar,

It sounds like you don't actually have any flatMap logic, in which case you
should use a sink instead of a flatMap, preferably one of the existing
connectors, as some of them already provide an exactly-once guarantee [1].

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/guarantees.html
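
For illustration, a minimal sketch of moving the write out of the flatMap
into a custom sink (Event and ExternalClient are hypothetical placeholders
for your record type and the client you currently call from the flatMap;
for exactly-once you would normally prefer one of the connectors in [1]):

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

public class ExternalWriteSink extends RichSinkFunction<Event> {

    private transient ExternalClient client;

    @Override
    public void open(Configuration parameters) throws Exception {
        // One connection per parallel sink instance.
        client = ExternalClient.connect();
    }

    @Override
    public void invoke(Event value, Context context) throws Exception {
        // On its own this gives at-least-once; exactly-once needs an
        // idempotent write or one of the transactional connectors in [1].
        client.write(value);
    }

    @Override
    public void close() throws Exception {
        if (client != null) {
            client.close();
        }
    }
}

Wired up as stream.addSink(new ExternalWriteSink()) instead of
stream.flatMap(...).
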
Regards,
Roman


On Thu, Apr 29, 2021 at 5:55 PM Raghavendar T S <raghav280...@gmail.com>
wrote:

> Hi Roman
>
> I am just doing write operations from the flat map. Does it matter if I
> use a flat map or a sink for this purpose?
>
> Thank you
>
>
>
> On Thu, Apr 29, 2021 at 9:10 PM Roman Khachatryan <ro...@apache.org>
> wrote:
>
>> Flink uses checkpoint barriers that are sent along the same channels as
>> the data. Events are included in a checkpoint if they precede the
>> corresponding barrier (or the RPC call, for sources). [1] describes the
>> algorithm and [2] covers the integration with Kafka.
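>>
>> For illustration, a minimal sketch of enabling this barrier-based,
>> exactly-once checkpointing on a job (the interval and job name are
>> arbitrary placeholders):
>>
>> import org.apache.flink.streaming.api.CheckpointingMode;
>> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>>
>> public class CheckpointedJob {
>>     public static void main(String[] args) throws Exception {
>>         StreamExecutionEnvironment env =
>>             StreamExecutionEnvironment.getExecutionEnvironment();
>>         // A barrier is injected into every source roughly every 10 seconds;
>>         // each operator snapshots its state when the barrier reaches it.
>>         env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);
>>         // ... define sources, operators and sinks here, then:
>>         env.execute("checkpointed-job");
>>     }
>> }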
>>
>> > In my example, I have only 1 Source and 1 Flat Map. Do you mean to say
>> that we need to use a sink instead of a flat map?
>> I'm not sure I understand the use case. What do you do with the results
>> of the Flat Map?
>>
>> [1] https://arxiv.org/pdf/1506.08603.pdf
>> [2]
>> https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html
>>
>> Regards,
>> Roman
>>
>>
>> On Thu, Apr 29, 2021 at 3:16 PM Raghavendar T S <raghav280...@gmail.com>
>> wrote:
>>
>>> Hi Roman
>>>
>>> In general, how does Flink track events from the source to downstream
>>> operators? We usually emit existing events from an operator, or create a
>>> new instance of a class and emit that. How does Flink (or the Flink
>>> source) know which snapshot the events belong to?
>>>
>>> > So you don't need to re-process it manually (given that the sink
>>> provides exactly once guarantee).
>>> In my example, I have only 1 Source and 1 Flat Map. Do you mean to say
>>> that we need to use a sink instead of a flat map?
>>>
>>> Thank you
>>>
>>>
>>>
>>>
>>> On Thu, Apr 29, 2021 at 6:13 PM Roman Khachatryan <ro...@apache.org>
>>> wrote:
>>>
>>>> Hi Raghavendar,
>>>>
>>>> In Flink, checkpoints are global, meaning that a checkpoint is
>>>> successful only if all operators acknowledge it. So the offset will be
>>>> stored in state and then committed to Kafka [1] only after all the tasks
>>>> acknowledge that checkpoint. At that moment, the element must have been
>>>> either emitted to the external system, stored in the operator state
>>>> (e.g. a window), or kept in channel state (with unaligned checkpoints).
>>>>
>>>> So you don't need to re-process it manually (given that the sink
>>>> provides exactly once guarantee).
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-offset-committing-behaviour-configuration
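>>>>
>>>> For illustration, a minimal sketch of the consumer setup [1] refers to
>>>> (topic, group and bootstrap address are placeholders); with checkpointing
>>>> enabled, offsets are committed back to Kafka only when a checkpoint
>>>> completes, and they are not what Flink uses for recovery:
>>>>
>>>> import java.util.Properties;
>>>> import org.apache.flink.api.common.serialization.SimpleStringSchema;
>>>> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>>>> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
>>>>
>>>> public class KafkaOffsetExample {
>>>>     public static void main(String[] args) throws Exception {
>>>>         StreamExecutionEnvironment env =
>>>>             StreamExecutionEnvironment.getExecutionEnvironment();
>>>>         env.enableCheckpointing(10_000L);
>>>>
>>>>         Properties props = new Properties();
>>>>         props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
>>>>         props.setProperty("group.id", "my-group");                // placeholder
>>>>
>>>>         FlinkKafkaConsumer<String> consumer =
>>>>             new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props);
>>>>         // Commit the offsets stored in the checkpoint back to Kafka once
>>>>         // the checkpoint completes (used for monitoring, not fault tolerance).
>>>>         consumer.setCommitOffsetsOnCheckpoints(true);
>>>>
>>>>         env.addSource(consumer).print();
>>>>         env.execute("kafka-offset-example");
>>>>     }
>>>> }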
>>>>
>>>> Regards,
>>>> Roman
>>>>
>>>>
>>>> On Thu, Apr 29, 2021 at 12:53 PM Raghavendar T S <
>>>> raghav280...@gmail.com> wrote:
>>>>
>>>>> Hi Team
>>>>>
>>>>> Assume that we have a job (checkpointing enabled) with a Kafka source
>>>>> and a stateless operator which consumes events from the Kafka source.
>>>>> We have a stream of records 1, 2, 3, 4, 5, ... Let's say event 1 reaches
>>>>> the Flat Map operator and is being processed. Then the Kafka source
>>>>> makes a successful checkpoint.
>>>>> In this case, will the offset of event 1 be part of the checkpoint?
>>>>> Will Flink track the event from the source to all downstream operators?
>>>>> If that is the case and the processing of the event fails (any
>>>>> third-party API/DB failure) in the Flat Map after a successful
>>>>> checkpoint, do we need to manually re-process the event (retry using a
>>>>> queue or other business logic)?
>>>>>
>>>>> Job:
>>>>> Kafka Source -> Flat Map
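>>>>>
>>>>> Roughly like this (a sketch of the job I mean; topic, servers and the
>>>>> place where the third-party API/DB call happens are placeholders):
>>>>>
>>>>> import java.util.Properties;
>>>>> import org.apache.flink.api.common.functions.FlatMapFunction;
>>>>> import org.apache.flink.api.common.serialization.SimpleStringSchema;
>>>>> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
>>>>> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
>>>>> import org.apache.flink.util.Collector;
>>>>>
>>>>> public class KafkaToFlatMapJob {
>>>>>     public static void main(String[] args) throws Exception {
>>>>>         StreamExecutionEnvironment env =
>>>>>             StreamExecutionEnvironment.getExecutionEnvironment();
>>>>>         env.enableCheckpointing(10_000L);
>>>>>
>>>>>         Properties props = new Properties();
>>>>>         props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
>>>>>         props.setProperty("group.id", "my-group");                // placeholder
>>>>>
>>>>>         env.addSource(new FlinkKafkaConsumer<>(
>>>>>                 "events", new SimpleStringSchema(), props))
>>>>>            .flatMap(new FlatMapFunction<String, String>() {
>>>>>                @Override
>>>>>                public void flatMap(String value, Collector<String> out) {
>>>>>                    // Stateless processing; the write to the third-party
>>>>>                    // API/DB (the failure point in my question) happens here.
>>>>>                }
>>>>>            });
>>>>>
>>>>>         env.execute("kafka-to-flatmap");
>>>>>     }
>>>>> }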
>>>>>
>>>>> Thank you
>>>>>
>>>>> --
>>>>> Raghavendar T S
>>>>> www.teknosrc.com
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>> --
>>> Raghavendar T S
>>> www.teknosrc.com
>>>
>>
>
> --
> Raghavendar T S
> www.teknosrc.com
>
