The main benefit of letting Flink keep the offsets is that you get
exactly-once semantics: with the offsets in Flink state, they are aligned
with all your other state.
When storing the offsets in Kafka, you get at-least-once semantics (i.e.,
you may see some messages twice on restore / when continuing).
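
As a minimal sketch of the "Flink keeps the offsets" setup (the topic,
broker address, group id, checkpoint interval and class name are
placeholder assumptions, using the FlinkKafkaConsumer API):

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class OffsetsInFlinkState {

        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // With checkpointing enabled, the Kafka offsets are snapshotted together
            // with all other operator state, which is what gives the exactly-once source.
            env.enableCheckpointing(60_000); // checkpoint every 60 seconds (placeholder)

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
            props.setProperty("group.id", "my-group");                // placeholder

            FlinkKafkaConsumer<String> consumer =
                    new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props);

            DataStream<String> stream = env.addSource(consumer);
            stream.print();

            env.execute("kafka-offsets-in-flink-state");
        }
    }

On a restart from a checkpoint or savepoint, the consumer resumes from the
offsets stored in that snapshot rather than from whatever was committed to
Kafka.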

On Thu, Feb 13, 2020 at 2:56 PM Timothy Victor <vict...@gmail.com> wrote:

> What are the pros and cons of Kafka keeping the offsets vs Flink keeping
> them?  Is one more reliable than the other?  Personally I prefer having
> Flink manage them, since that is intrinsically tied to its checkpointing
> mechanism.  But I am interested to learn from others' experiences.
>
> Thanks
>
> Tim
>
> On Thu, Feb 13, 2020, 12:39 AM Hegde, Mahendra <mahendra.he...@arity.com>
> wrote:
>
>> Thanks Theo !
>>
>>
>>
>> *From: *"theo.diefent...@scoop-software.de" <
>> theo.diefent...@scoop-software.de>
>> *Date: *Thursday, 13 February 2020 at 12:13 AM
>> *To: *"Hegde, Mahendra" <mahendra.he...@arity.com>, "
>> user@flink.apache.org" <user@flink.apache.org>
>> *Subject: *[External] AW: How Flink Kafka Consumer works when it restarts
>>
>>
>>
>> Hi Mahendra,
>>
>>
>>
>> Flink will regularly create checkpoints, or manually triggered savepoints.
>> This data is managed and stored by Flink, and it also contains the Kafka
>> offsets.
>>
>>
>>
>> When restarting, you can configure the job to restart from the last
>> checkpoint or from a savepoint.
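>>
>> For example, the CLI can resume a job from a savepoint (or a retained
>> checkpoint) by passing its path; the path and jar name below are just
>> placeholders:
>>
>>     ./bin/flink run -s hdfs:///flink/savepoints/savepoint-abc123 my-job.jar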
>>
>>
>>
>> You can additionally configure Flink to commit the offsets to Kafka,
>> again only when a checkpoint completes. You can then configure Flink to
>> start from the committed offsets if you don't let it restart from an
>> existing checkpoint or savepoint, which is where it would look first to
>> restore the offsets.
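>>
>> As a rough sketch with the FlinkKafkaConsumer API (broker, group id and
>> topic are placeholders):
>>
>>     Properties props = new Properties();
>>     props.setProperty("bootstrap.servers", "localhost:9092");
>>     props.setProperty("group.id", "my-group");
>>
>>     FlinkKafkaConsumer<String> consumer = new FlinkKafkaConsumer<>(
>>             "my-topic", new SimpleStringSchema(), props);
>>     // commit offsets back to Kafka, but only when a checkpoint completes
>>     consumer.setCommitOffsetsOnCheckpoints(true);
>>     // used only when the job is NOT restored from a checkpoint or savepoint
>>     consumer.setStartFromGroupOffsets();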
>>
>>
>>
>> Once the offsets are loaded, either from a checkpoint, a savepoint or
>> Kafka, Flink communicates directly with Kafka and polls messages starting
>> from those offsets.
>>
>>
>>
>> Best regards
>>
>> Theo
>>
>>
>>
>>
>> Sent from my Huawei phone
>>
>>
>>
>> -------- Original message --------
>> From: "Hegde, Mahendra" <mahendra.he...@arity.com>
>> Date: Wed, 12 Feb 2020, 17:50
>> To: user@flink.apache.org
>> Subject: How Flink Kafka Consumer works when it restarts
>>
>> Hi All,
>>
>>
>>
>> I am a bit confused about how the Flink Kafka consumer works.
>>
>> I read that Flink stores the Kafka message offsets in checkpoints and uses
>> them in case it restarts.
>>
>>
>>
>> My question is: when exactly does Flink commit a successful-consumption
>> confirmation to the Kafka broker?
>>
>> And when the Flink job restarts, will it send the last offset available in
>> the checkpoint to the Kafka broker and start consuming from that point?
>>
>> Or will the Kafka broker resume based on the last committed offset
>> information available?
>>
>> (I mean, who manages the actual offset here, the Kafka broker or the Flink
>> client?)
>>
>>
>>
>> Thanks
>>
>> Mahendra
>>
>
