Thanks again for the detailed info. It makes a lot of sense.
One last question: can I create a checkpoint as soon as a job starts? In
this case, the first record’s offset will be in the checkpoint state and
this will provide the “strong” guarantee as you said. Did I miss anything? I
read the code…
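For context, this is roughly how I configure checkpointing today (just a
sketch; the class name, the 10-second interval and the min-pause value are
made up for illustration, not recommendations):

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointConfigSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpoint every 10 seconds in exactly-once mode. As far as I can tell,
        // the first checkpoint only fires after the interval elapses, which is why
        // I'm asking whether a checkpoint can be taken right at job start.
        env.enableCheckpointing(10_000L, CheckpointingMode.EXACTLY_ONCE);
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(500L);
        // ... sources, transformations and sinks would be added here, then env.execute()
    }
}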
Hi!
Also note that although this eventual consistency may not seem good enough,
for 99.99% of the time the job runs smoothly without failure. In this case
the records are correct. Only in the 0.01% case, when the job fails, will
users see inconsistency for a small period of time (for a c…
Hi!
Flink guarantees *eventual* consistency for systems without transactions
(by transaction I mean a system that supports writing a few records and then
committing them), or for systems with transactions where users prefer latency
over consistency. That is to say, everything produced by Flink before a
checkpoint is "not secure"…
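To make the trade-off concrete, here is a rough sketch of how the two options
look with the KafkaSink builder (assuming the newer flink-connector-kafka API;
the broker address, topic, transactional-id prefix and class name are
placeholders):

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

public class SinkGuaranteeSketch {
    static KafkaSink<String> buildSink() {
        return KafkaSink.<String>builder()
                .setBootstrapServers("broker:9092")                 // placeholder address
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("output-topic")                   // placeholder topic
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                // EXACTLY_ONCE writes inside Kafka transactions that are committed
                // when a checkpoint completes, so consumers reading with
                // isolation.level=read_committed never see pre-checkpoint results.
                // AT_LEAST_ONCE trades that consistency for lower latency.
                .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
                .setTransactionalIdPrefix("my-job")                 // required for EXACTLY_ONCE
                .build();
    }
}

If I remember correctly, with EXACTLY_ONCE the producer's transaction.timeout.ms
also needs to be larger than the checkpoint interval, otherwise Kafka may abort
transactions before Flink gets to commit them.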
Hi Caizhi,
Thank you for the quick response. Can you help me understand how
reprocessing the data with the earliest starting-offset ensures exactly-once
processing? First, the earliest offset could be way beyond the 1st record in
my example, since the first time the job started from the latest offset…
Hi!
This is a valid case. The starting-offset is the offset for the Kafka source
to read from when the job starts *without a checkpoint*. That is to say, if
your job has been running for a while, has completed several checkpoints and
is then restarted, the Kafka source won't read from the starting-offset, but
from the offsets stored in the latest completed checkpoint.
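A small sketch to illustrate, assuming the new KafkaSource API (the broker
address, topic, group id and class name are placeholders):

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;

public class SourceOffsetSketch {
    static KafkaSource<String> buildSource() {
        return KafkaSource.<String>builder()
                .setBootstrapServers("broker:9092")        // placeholder address
                .setTopics("input-topic")                  // placeholder topic
                .setGroupId("my-group")                    // placeholder group id
                // Only consulted when the job starts with no checkpoint/savepoint;
                // on recovery, the offsets stored in the checkpoint take precedence.
                .setStartingOffsets(OffsetsInitializer.latest())
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
    }
}

Swapping OffsetsInitializer.latest() for OffsetsInitializer.earliest() only
changes this fresh-start behavior, not what happens on recovery from a
checkpoint.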
Can someone help me understand how Flink deals with the following scenario?
I have a job that reads from a source Kafka (starting-offset: latest) and
writes to a sink Kafka with exactly-once execution. Let's say that I have 2
records in the source. The 1st one is processed without issue and the job…
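For reference, this is roughly the shape of the job (a minimal sketch; the
60-second checkpoint interval and the class name are made up, and
buildSource()/buildSink() stand for the illustrative KafkaSource/KafkaSink
builder sketches shown in the replies above):

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.connector.kafka.source.KafkaSource;
import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class KafkaToKafkaJobSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Exactly-once requires checkpointing; 60s is just an example interval.
        env.enableCheckpointing(60_000L, CheckpointingMode.EXACTLY_ONCE);

        // Source starts from the latest offset (as in the scenario above);
        // the sink writes with the EXACTLY_ONCE delivery guarantee.
        KafkaSource<String> source = SourceOffsetSketch.buildSource();
        KafkaSink<String> sink = SinkGuaranteeSketch.buildSink();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "kafka-source")
           .sinkTo(sink);
        env.execute("kafka-to-kafka-exactly-once");
    }
}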