Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Arvid Heise
Hi Rahul, Checkpointing is Flink's way of providing processing guarantees "at least once"/"exactly once". So your question is like asking if a car offers any safety without you wanting to use a built-in belt and airbags. Sure you could install your own safety features but chances are that your sol

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Rahul Patwari
Hi Arvid, Thanks for your inputs. They are super helpful. Why do you need the window operator at all? Couldn't you just backpressure > on the async I/O by delaying the processing there? > I haven't explored this approach. Wouldn't the backpressure gets propagated upstream and the consumption rat

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Arvid Heise
Hi Rahul, This pipeline should process millions of records per day with low latency. > I am avoiding Checkpointing, as the records in the Window operator and > in-flight records in the Async I/O operator are persisted along with the > Kafka offsets. But the records in Window and Async I/O operator

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Rahul Patwari
Hi Arvid, Thanks for the reply. could you please help me to understand how the at least once guarantee > would work without checkpointing in your case? > This was the plan to maintain "at least once" guarantee: Logic at Sink: The DataStream on which Sink Function is applied, on the same DataStre

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Arvid Heise
Hi Rahul, could you please help me to understand how the at least once guarantee would work without checkpointing in your case? Let's say you read records A, B, C. You use a window to delay processing, so let's say A passes and B, C are still in the window for the trigger. Now do you want to aut

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-13 Thread Roman Khachatryan
Hi Rahul, Right. There are no workarounds as far as I know. Regards, Roman On Mon, Apr 12, 2021 at 9:00 PM Rahul Patwari wrote: > > Hi Roman, Arvid, > > So, to achieve "at least once" guarantee, currently, automatic restart of > Flink should be disabled? > Is there any workaround to get "at le

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-12 Thread Rahul Patwari
Hi Roman, Arvid, So, to achieve "at least once" guarantee, currently, automatic restart of Flink should be disabled? Is there any workaround to get "at least once" semantics with Flink Automatic restarts in this case? Regards, Rahul On Mon, Apr 12, 2021 at 9:57 PM Roman Khachatryan wrote: > Hi

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-12 Thread Roman Khachatryan
Hi, Thanks for the clarification. > Other than managing offsets externally, Are there any other ways to guarantee > "at least once" processing without enabling checkpointing? That's currently not possible, at least with the default connector. Regards, Roman On Mon, Apr 12, 2021 at 3:14 PM Rah

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-12 Thread Rahul Patwari
Hi Roman, Thanks for the reply. This is what I meant by Internal restarts - Automatic restore of Flink Job from a failure. For example, pipeline restarts when Fixed delay or Failure

Re: Manage Kafka Offsets for Fault Tolerance Without Checkpointing

2021-04-12 Thread Roman Khachatryan
Hi, Could you please explain what you mean by internal restarts? If you commit offsets or timestamps from sink after emitting records to the external system then there should be no data loss. Otherwise (if you commit offsets earlier), you have to persist in-flight records to avoid data loss (i.e.