Problem with the committed offsets while using KafkaSource and having the checkpointing enabled

Konstantinos Karavitis Wed, 05 Mar 2025 04:18:27 -0800

Hi team!
We have observed strange behavior when using KafkaSource with
checkpointing enabled.


Even though we see no checkpoint failures or errors of any kind, the
committed offset goes up and down erratically instead of advancing
steadily toward the end offset of the topic.
We know that Flink does not rely on the offsets committed to Kafka, but
instead captures that information in the checkpoints it takes.
Flink commits offsets back to Kafka only for reporting/monitoring purposes.

So in our example, we have a KafkaSource consuming from a topic A whose
endOffset is 60_000.
After some time the committedOffset reaches 30_000, then 40_000, etc.
At some point it reaches 47_000 and then drops back down with no
discernible pattern, e.g. 47_000 -> 23_000 -> 27_000 -> 36_000 -> 22_000,
and so on.
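
To make the pattern concrete, here is a minimal sketch that flags every
sample at which the committed offset moved backwards. The sample values are
the ones quoted above. The class and method names are hypothetical, and the
sketch assumes all samples belong to the same partition and consumer group,
which is an assumption and not something established in this message.

```java
import java.util.ArrayList;
import java.util.List;

class OffsetRegressionCheck {

    /** Returns the indices of samples that are lower than the sample before them. */
    public static List<Integer> findRegressions(long[] committedOffsets) {
        List<Integer> regressions = new ArrayList<>();
        for (int i = 1; i < committedOffsets.length; i++) {
            if (committedOffsets[i] < committedOffsets[i - 1]) {
                regressions.add(i);
            }
        }
        return regressions;
    }

    public static void main(String[] args) {
        // Committed offsets in the order they were observed (values from the message).
        long[] samples = {30_000, 40_000, 47_000, 23_000, 27_000, 36_000, 22_000};
        for (int i : findRegressions(samples)) {
            System.out.printf("sample %d: %d -> %d%n", i, samples[i - 1], samples[i]);
        }
    }
}
```

Running the same check separately on the offsets collected for each partition
would show whether the drops happen within a single partition or only appear
when values from several partitions are viewed together.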
Unfortunately, there is no error or checkpoint failure that could explain
this behavior.
Moreover, we would like to try unaligned checkpoints to rule out
backpressure as the cause, but that is not an option at the moment.

What do you suggest? Have you experienced that before?
Thanks in advance.
