Hello,

Is there any further clarification or explanation regarding the subject, please?

Best regards.
On Wed, 19 Mar 2025 at 15:02, mejri houssem <mejrihousse...@gmail.com> wrote:

> Hello,
>
> So if I understand you correctly, I cannot rely on the Kafka broker
> offset to achieve an at-least-once guarantee. Without
> checkpoints/savepoints enabled, that would not be possible.
>
> Best regards
>
> On Wed, 19 Mar 2025 at 12:00, Ahmed Hamdy <hamdy10...@gmail.com> wrote:
>
>> Hi Mejri,
>> Not exactly. You can still rely on a savepoint to restart/redeploy the
>> job from the latest offset recorded in Flink; my reply was about your
>> question of whether you can replace that and just depend on the offsets
>> committed in the Kafka broker. For the at-least-once semantic,
>> savepoints and checkpoints book-keep the offsets for the Flink job
>> after initialization; the config I mentioned only controls how the
>> consumers are initialized. If you start the job without a savepoint, it
>> falls back to that config (which may use the broker's committed
>> offsets); that might achieve the semantic, but it doesn't guarantee it.
>> For example, assume you restore from a savepoint and the job completes
>> a couple of checkpoints, so the committed offset in Kafka is updated;
>> then you discover a bug. If you depend only on the Kafka broker's
>> committed offset, you would probably break the semantic, whereas with
>> savepoints you can redeploy from the savepoint of the last correct
>> version and reprocess the data that was processed by the buggy job.
>>
>> Best Regards
>> Ahmed Hamdy
>>
>> On Wed, 19 Mar 2025 at 00:54, mejri houssem <mejrihousse...@gmail.com> wrote:
>>
>>> Hello Ahmed,
>>>
>>> Thanks for the response.
>>>
>>> Does that mean checkpoints and savepoints have nothing to do with the
>>> at-least-once guarantee, since it depends solely on the starting-offset
>>> configuration?
>>>
>>> Best Regards
>>>
>>> On Tue, 18 Mar 2025 at 23:59, Ahmed Hamdy <hamdy10...@gmail.com> wrote:
>>>
>>>> Hi Mejri,
>>>>
>>>> > I’m wondering if this is strictly necessary, since the Kafka broker
>>>> > itself keeps track of offsets (if I am not mistaken). In other
>>>> > words, if we redeploy the job, will it automatically resume from
>>>> > the last Kafka offset, or should we still rely on Flink’s
>>>> > checkpoint/savepoint mechanism to ensure correct offset recovery?
>>>>
>>>> This depends on the starting offset you set in the source config [1].
>>>> You can configure it to start from the earliest offset, the last
>>>> committed offset, the latest offset, or a specific offset.
>>>>
>>>> I am not 100% sure about RabbitMQ; IIRC it uses checkpoints to ack
>>>> read messages, unlike Kafka.
>>>>
>>>> 1- https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/datastream/kafka/#starting-offset
>>>>
>>>> Best Regards
>>>> Ahmed Hamdy
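
For reference, a minimal sketch of the starting-offset configuration
discussed above, using the KafkaSource builder from the docs linked in
[1]; the broker address, topic, and group id are placeholders:

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

// Placeholder broker/topic/group values for illustration only.
KafkaSource<String> source = KafkaSource.<String>builder()
        .setBootstrapServers("broker:9092")
        .setTopics("input-topic")
        .setGroupId("my-consumer-group")
        // Start from the offsets committed in the Kafka broker, falling
        // back to the earliest offset when no committed offset exists.
        // Alternatives: OffsetsInitializer.earliest(), .latest(), or
        // .offsets(...) for specific per-partition offsets.
        .setStartingOffsets(OffsetsInitializer.committedOffsets(OffsetResetStrategy.EARLIEST))
        .setValueOnlyDeserializer(new SimpleStringSchema())
        .build();

As noted above, this initializer only applies when the job starts
without prior state; when restoring from a checkpoint or savepoint, the
offsets stored in the snapshot take precedence.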
>>>> On Tue, 18 Mar 2025 at 22:20, mejri houssem <mejrihousse...@gmail.com> wrote:
>>>>
>>>>> Hello everyone,
>>>>>
>>>>> We have a stateless Flink job that uses a Kafka source with
>>>>> at-least-once guarantees. We’ve enabled checkpoints so that, in the
>>>>> event of a restart, Flink can restore from the last committed offset
>>>>> stored in a successful checkpoint. Now we’re considering enabling
>>>>> savepoints for our production deployment.
>>>>>
>>>>> I’m wondering if this is strictly necessary, since the Kafka broker
>>>>> itself keeps track of offsets (if I am not mistaken). In other words,
>>>>> if we redeploy the job, will it automatically resume from the last
>>>>> Kafka offset, or should we still rely on Flink’s checkpoint/savepoint
>>>>> mechanism to ensure correct offset recovery?
>>>>>
>>>>> Additionally, we have another job that uses a RabbitMQ source with
>>>>> checkpoints enabled in order to activate manual acknowledgments. Does
>>>>> the same logic apply in that case as well?
>>>>>
>>>>> Thanks in advance for any guidance!
>>>>>
>>>>> Best Regards.
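
On the RabbitMQ part of the question, a minimal sketch of how such a
job might be wired up, assuming the flink-connector-rabbitmq RMQSource;
the host, credentials, and queue name are placeholders. With
checkpointing enabled, the source acknowledges messages only when a
checkpoint completes, which gives the at-least-once behaviour discussed
in this thread:

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.rabbitmq.RMQSource;
import org.apache.flink.streaming.connectors.rabbitmq.common.RMQConnectionConfig;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// The RMQSource acks messages on checkpoint completion, so checkpointing
// must be enabled for acknowledgments to happen at all.
env.enableCheckpointing(60_000, CheckpointingMode.AT_LEAST_ONCE);

// Placeholder connection settings for illustration only.
RMQConnectionConfig connectionConfig = new RMQConnectionConfig.Builder()
        .setHost("rabbitmq-host")
        .setPort(5672)
        .setVirtualHost("/")
        .setUserName("user")
        .setPassword("password")
        .build();

DataStream<String> stream = env.addSource(new RMQSource<>(
        connectionConfig,
        "input-queue",        // placeholder queue name
        false,                // usesCorrelationId: false suffices for at-least-once
        new SimpleStringSchema()));

The same logic applies, with one difference: for Kafka the checkpoint
stores offsets to seek back to, while for RabbitMQ the checkpoint gates
the acks, so messages left unacknowledged at failure are redelivered by
the broker after a restart.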