Hi,
Thanks for your analysis.
We found LeaderElectionRateAndTimeMs go to non-zero value on Kafka around
the same time when this error was seen in the Flink job.
Kafka itself recovers from this and so do any other consumers that we have.
It seems like a bug in kafka consumer library if this error
Probably your kafka consumer is rebalancing. This can be due to a bigger
message processing time due to which kafka broker is marking your consumer
dead and rebalancing. This all happens before the consumer can commit the
offsets.
On Mon, Jun 11, 2018 at 7:37 PM Piotr Nowojski
wrote:
> The more
The more I look into it, the more it seems like a Kafka bug or some cluster
failure from which your Kafka cluster did not recover.
In your cases auto committing should be set to true and in that case
KafkaConsumer should commit offsets once every so often when it’s polling
messages. Unless for
Hi Piotr, thanks for your insights.
> What’s your KafkaConsumer configuration?
We only set these in the properties that are passed to
FlinkKafkaConsumer010 constructor:
auto.offset.reset=latest
bootstrap.servers=my-kafka-host:9092
group.id=my_group
flink.partition-discovery.interval-millis=3
Hi,
What’s your KafkaConsumer configuration? Especially values for:
- is checkpointing enabled?
- enable.auto.commit (or auto.commit.enable for Kafka 0.8) /
auto.commit.interval.ms
- did you set setCommitOffsetsOnCheckpoints() ?
Please also refer to
https://ci.apache.org/projects/flink/flink-do