Hi guys, I'm having an issue with a Kafka Streams app: at some point I get
a consumer LeaveGroup message. It is exactly the same issue that another
person described here:

https://stackoverflow.com/questions/61245480/how-to-detect-a-kafka-streams-app-in-zombie-state

The problem is that the stream state keeps reporting that the stream is
running even though it is not consuming anything, and the stream never
rejoins the consumer group, so my application, which has only one replica,
stops consuming.

I have a health check on Kubernetes that exposes the stream state so that
the pod gets restarted when the stream is not healthy.
But since the Kafka Streams state is always running when the consumer
leaves the group, the app still looks healthy while it is in the zombie
state, so I need to go and restart the pod manually.

Is this a bug? Or is there a way to check the state of the stream's
consumer so that I can expose it as a health check for my application?
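
To make the question concrete, here is a rough sketch of the kind of check
I have in mind. It assumes the consumer metric `last-poll-seconds-ago`
(group `consumer-metrics`) is visible through `KafkaStreams.metrics()`,
and it treats a consumer as stalled once that value exceeds
max.poll.interval.ms. The helper names `stalledConsumers` and `isHealthy`
are made up for illustration, and I have not confirmed that this catches
the zombie state.

```kotlin
// Hypothetical helpers, not part of the Kafka API.
// pollAgesSeconds maps a consumer client-id to the value of its
// "last-poll-seconds-ago" metric.
//
// Wiring sketch (needs kafka-streams on the classpath, untested):
// val ages = streams.metrics()
//     .filterKeys { it.group() == "consumer-metrics" && it.name() == "last-poll-seconds-ago" }
//     .map { (name, metric) ->
//         name.tags()["client-id"].orEmpty() to (metric.metricValue() as Number).toDouble()
//     }
//     .toMap()

fun stalledConsumers(
    pollAgesSeconds: Map<String, Double>,
    maxPollIntervalMs: Long
): List<String> {
    val limitSeconds = maxPollIntervalMs / 1000.0
    return pollAgesSeconds
        // The restore consumer is only polled while state is being restored,
        // so a large value there is expected and is skipped.
        .filterKeys { !it.endsWith("-restore-consumer") }
        // Assumption: a negative value means the consumer has not polled yet.
        .filterValues { it >= 0 && it > limitSeconds }
        .keys
        .sorted()
}

// Healthy only if Streams reports running AND no main consumer is stalled.
fun isHealthy(
    streamsRunning: Boolean,
    pollAgesSeconds: Map<String, Double>,
    maxPollIntervalMs: Long
): Boolean =
    streamsRunning && stalledConsumers(pollAgesSeconds, maxPollIntervalMs).isEmpty()
```

The Kubernetes probe would then return a failure whenever `isHealthy` is
false, instead of looking at the stream state alone.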

This issue happens randomly, usually on Mondays. I'm using Kafka 2.8.1 and
my app is written in Kotlin.

This is the message I get before the zombie state; after it there are no
exceptions, errors, or anything else until I restart the pod manually.

Sending LeaveGroup request to coordinator
b-3.c4.kafka.us-east-1.amazonaws.com:9098 (id: 2147483644 rack: null)
due to consumer poll timeout has expired. This means the time between
subsequent calls to poll() was longer than the configured
max.poll.interval.ms, which typically implies that the poll loop is
spending too much time processing messages. You can address this
either by increasing max.poll.interval.ms or by reducing the maximum
size of batches returned in poll() with max.poll.records.
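
For reference, the two settings named in that message could be set on the
Streams properties roughly as below. This is only a sketch: the function
name `pollTuning` is made up and the values are arbitrary placeholders,
not what my app runs with.

```kotlin
import java.util.Properties

// Hypothetical helper: copies the given properties and overrides the two
// consumer settings mentioned in the LeaveGroup log message.
fun pollTuning(base: Properties = Properties()): Properties {
    val props = Properties()
    props.putAll(base)
    // Allow more time between two poll() calls (placeholder: 10 minutes).
    props.setProperty("max.poll.interval.ms", "600000")
    // Return fewer records per poll() so each batch is processed faster
    // (placeholder value).
    props.setProperty("max.poll.records", "100")
    return props
}
```

The resulting `Properties` would be passed to the `KafkaStreams`
constructor together with the rest of the application config.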


Thanks for the help.
