Hi Piotr,

You are correct: when the job is stopped gracefully with a final savepoint, the final checkpoint barrier flushes the sink and the Kafka offsets are committed, so restarting from that savepoint should not result in any duplicates being sent to RabbitMQ.
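For reference, the graceful stop from [1] is triggered via the CLI; a minimal sketch (the savepoint directory, job id, and jar path are placeholders):

    # Stop the job gracefully, taking a final savepoint:
    $ ./bin/flink stop --savepointPath /tmp/flink-savepoints <jobId>

    # Later, resume from that savepoint:
    $ ./bin/flink run --fromSavepoint /tmp/flink-savepoints/savepoint-<id> <jarFile>

And here is a rough sketch of the pipeline shape you described, assuming Flink 1.15 with the flink-connector-kafka and flink-connector-rabbitmq artifacts on the classpath; all hosts, topics, group ids, and queue names are made-up placeholders, and the keyBy/map pair just stands in for your stateful transformation:

    import org.apache.flink.api.common.eventtime.WatermarkStrategy;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.connector.kafka.source.KafkaSource;
    import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.rabbitmq.RMQSink;
    import org.apache.flink.streaming.connectors.rabbitmq.common.RMQConnectionConfig;

    public class KafkaToRabbitJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Completed checkpoints also commit the Kafka offsets back to the consumer group.
            env.enableCheckpointing(60_000);

            // Placeholder broker address, topic, and group id.
            KafkaSource<String> source = KafkaSource.<String>builder()
                    .setBootstrapServers("kafka:9092")
                    .setTopics("input-topic")
                    .setGroupId("my-group")
                    .setStartingOffsets(OffsetsInitializer.earliest())
                    .setValueOnlyDeserializer(new SimpleStringSchema())
                    .build();

            // Placeholder RabbitMQ connection settings.
            RMQConnectionConfig rmqConfig = new RMQConnectionConfig.Builder()
                    .setHost("rabbitmq")
                    .setPort(5672)
                    .setVirtualHost("/")
                    .setUserName("guest")
                    .setPassword("guest")
                    .build();

            env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
                    // Stand-in for your stateful transformation.
                    .keyBy(value -> value)
                    .map(value -> value.toUpperCase())
                    // RMQSink is at-least-once; on a graceful stop the final
                    // checkpoint barrier flushes it before the job shuts down.
                    .addSink(new RMQSink<>(rmqConfig, "output-queue", new SimpleStringSchema()));

            env.execute("kafka-to-rabbitmq");
        }
    }

Note that the at-least-once guarantee still applies on failure: if the job crashes between two checkpoints, the replayed records can reach RabbitMQ twice. It is only the graceful stop-with-savepoint path that avoids duplicates.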
Best regards,
Alexander

On Thu, May 12, 2022 at 11:28 AM Piotr Domagalski <pi...@domagalski.com> wrote:

> Hi,
>
> I'm planning to build a pipeline that uses a Kafka source, some stateful
> transformation, and a RabbitMQ sink. What I don't yet fully understand is
> how often I should expect the "at-least-once" scenario (i.e. seeing
> duplicates) on the sink side. The case when things start failing is clear
> to me, but what happens when I want to gracefully stop the Flink job?
>
> Am I right in thinking that when I gracefully stop a job with a final
> savepoint [1], the Kafka source stops consuming, a checkpoint barrier is
> sent through the pipeline, and this flushes the sink completely? My
> understanding is that if nothing fails and the Kafka offset is committed,
> then when the job is started again from that savepoint, it will not
> result in any duplicates being sent to RabbitMQ. Is that correct?
>
> Thanks!
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/cli/#stopping-a-job-gracefully-creating-a-final-savepoint
>
> --
> Piotr Domagalski

--
Alexander Preuß | Engineer - Data Intensive Systems
alexanderpre...@ververica.com
Ververica GmbH <https://www.ververica.com/>