Hi Raghavendar,

It sounds like you don't actually have any flatMap logic, in which case you should use a sink instead of a flatMap, preferably one of the existing ones, as some of them already provide an exactly-once guarantee [1].
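As a toy illustration of why this matters, here is a small simulation in plain Python (not the Flink API; the classes and method names are invented for the sketch). A flatMap that writes eagerly to an external system repeats its writes when the stream is replayed from the last checkpoint, while a two-phase-commit style sink buffers writes per checkpoint and makes them visible only once the checkpoint completes:

```python
class EagerFlatMapWriter:
    """Writes to the external system as each event arrives (flatMap-style)."""
    def __init__(self, external):
        self.external = external

    def process(self, event):
        self.external.append(event)  # side effect happens immediately


class TwoPhaseCommitSink:
    """Buffers writes per checkpoint; commits only on checkpoint completion."""
    def __init__(self, external):
        self.external = external
        self.pending = []

    def process(self, event):
        self.pending.append(event)          # pre-commit phase: buffer only

    def on_checkpoint_complete(self):
        self.external.extend(self.pending)  # commit phase: writes become visible
        self.pending = []

    def on_restore(self):
        self.pending = []                   # uncommitted writes are discarded


def run(events, writer, fail_before_checkpoint):
    """Process events; optionally crash before the checkpoint, then replay."""
    for e in events:
        writer.process(e)
    if fail_before_checkpoint:
        if hasattr(writer, "on_restore"):
            writer.on_restore()
        for e in events:                    # replay from the last checkpoint
            writer.process(e)
    if hasattr(writer, "on_checkpoint_complete"):
        writer.on_checkpoint_complete()


flatmap_out, sink_out = [], []
run([1, 2, 3], EagerFlatMapWriter(flatmap_out), fail_before_checkpoint=True)
run([1, 2, 3], TwoPhaseCommitSink(sink_out), fail_before_checkpoint=True)

print(flatmap_out)  # [1, 2, 3, 1, 2, 3]: the replay duplicated the writes
print(sink_out)     # [1, 2, 3]: exactly once
```

This is the intuition behind preferring a sink with transactional (two-phase commit) support over ad-hoc writes in a flatMap.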
[1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/guarantees.html

Regards,
Roman

On Thu, Apr 29, 2021 at 5:55 PM Raghavendar T S <raghav280...@gmail.com> wrote:
> Hi Roman
>
> I am just doing write operations from the flat map. Does it matter if I
> use a flat map or a sink for this purpose?
>
> Thank you
>
> On Thu, Apr 29, 2021 at 9:10 PM Roman Khachatryan <ro...@apache.org> wrote:
>> Flink uses checkpoint barriers that are sent along the same channels as
>> the data. Events are included in the checkpoint if they precede the
>> corresponding barrier (or the RPC call, for sources). [1] describes the
>> algorithm and [2] covers the integration with Kafka.
>>
>> > In my example, I have only 1 Source and 1 Flat Map. Do you mean to say
>> > that we need to use a sink instead of a flat map?
>> I'm not sure I understand the use case. What do you do with the results
>> of the Flat Map?
>>
>> [1] https://arxiv.org/pdf/1506.08603.pdf
>> [2] https://flink.apache.org/features/2018/03/01/end-to-end-exactly-once-apache-flink.html
>>
>> Regards,
>> Roman
>>
>> On Thu, Apr 29, 2021 at 3:16 PM Raghavendar T S <raghav280...@gmail.com> wrote:
>>> Hi Roman
>>>
>>> In general, how does Flink track events from the source to downstream
>>> operators? We usually emit existing events from an operator, or create
>>> a new instance of a class and emit it. How does Flink (or the Flink
>>> source) know which snapshot the events belong to?
>>>
>>> > So you don't need to re-process it manually (given that the sink
>>> > provides an exactly-once guarantee).
>>> In my example, I have only 1 Source and 1 Flat Map.
>>> Do you mean to say that we need to use a sink instead of a flat map?
>>>
>>> Thank you
>>>
>>> On Thu, Apr 29, 2021 at 6:13 PM Roman Khachatryan <ro...@apache.org> wrote:
>>>> Hi Raghavendar,
>>>>
>>>> In Flink, checkpoints are global, meaning that a checkpoint is
>>>> successful only if all operators acknowledge it. So the offset will be
>>>> stored in state, and then committed to Kafka [1], only after all the
>>>> tasks acknowledge that checkpoint. At that moment, the element must be
>>>> either emitted to the external system, stored in the operator state
>>>> (e.g. a window), or stored in the channel state (with unaligned
>>>> checkpoints).
>>>>
>>>> So you don't need to re-process it manually (given that the sink
>>>> provides an exactly-once guarantee).
>>>>
>>>> [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-offset-committing-behaviour-configuration
>>>>
>>>> Regards,
>>>> Roman
>>>>
>>>> On Thu, Apr 29, 2021 at 12:53 PM Raghavendar T S <raghav280...@gmail.com> wrote:
>>>>> Hi Team
>>>>>
>>>>> Assume that we have a job (checkpointing enabled) with a Kafka source
>>>>> and a stateless operator which consumes events from the Kafka source.
>>>>> We have a stream of records 1, 2, 3, 4, 5, ... Let's say event 1
>>>>> reaches the Flat Map operator and is being processed, and then the
>>>>> Kafka source makes a successful checkpoint.
>>>>> In this case, will the offset of event 1 be part of the checkpoint?
>>>>> Will Flink track the event from the source to all downstream
>>>>> operators?
>>>>> If so, and if processing of the event fails (e.g. due to a
>>>>> third-party API/DB failure) in the Flat Map after a successful
>>>>> checkpoint, do we need to manually re-process the event (retry using
>>>>> a queue or other business logic)?
>>>>>
>>>>> Job:
>>>>> Kafka Source -> Flat Map
>>>>>
>>>>> Thank you
>>>>>
>>>>> --
>>>>> Raghavendar T S
>>>>> www.teknosrc.com
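The mechanics discussed in this thread, a barrier travelling in-band with the data, an operator's snapshot reflecting only the events that preceded that barrier, and the source offset being committed only once every task has acknowledged the checkpoint, can be sketched in a small simulation. This is plain Python illustrating the semantics described above, not Flink code; all class and method names are invented:

```python
class Barrier:
    """Checkpoint barrier, flowing in the same channel as the records."""
    def __init__(self, checkpoint_id, source_offset):
        self.checkpoint_id = checkpoint_id
        self.source_offset = source_offset   # offset the source snapshotted


class Coordinator:
    """Marks a checkpoint complete (and commits the source offset) only
    once every participating task has acknowledged it: checkpoints are
    global."""
    def __init__(self, task_ids):
        self.task_ids = set(task_ids)
        self.acks = {}                       # checkpoint_id -> acked tasks
        self.committed_offset = None

    def ack(self, checkpoint_id, task_id, offset):
        acked = self.acks.setdefault(checkpoint_id, set())
        acked.add(task_id)
        if acked == self.task_ids:           # all tasks acknowledged
            self.committed_offset = offset


class CountingOperator:
    """Stateful downstream task: counts events, snapshots on each barrier."""
    def __init__(self, name, coordinator):
        self.name = name
        self.coordinator = coordinator
        self.count = 0
        self.snapshots = {}                  # checkpoint_id -> state

    def receive(self, item):
        if isinstance(item, Barrier):
            # Snapshot reflects exactly the events that preceded the barrier.
            self.snapshots[item.checkpoint_id] = self.count
            self.coordinator.ack(item.checkpoint_id, self.name,
                                 item.source_offset)
        else:
            self.count += 1


coord = Coordinator(["source", "flatmap"])
op = CountingOperator("flatmap", coord)

# The source reads records at offsets 0..2, then injects barrier 1 into the
# channel and acknowledges on its own behalf (mimicking the RPC-triggered
# source snapshot).
coord.ack(1, "source", offset=2)
print(coord.committed_offset)    # None: the flatMap task has not acked yet

for item in ["r0", "r1", "r2", Barrier(1, source_offset=2)]:
    op.receive(item)

print(op.snapshots[1])           # 3: events r0..r2 precede barrier 1
print(coord.committed_offset)    # 2: committed once all tasks have acked
```

The point of the sketch is the ordering guarantee: nothing that arrives after the barrier can leak into checkpoint 1, and the offset is not committed while any task has yet to acknowledge.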