Bernhard Bonigl created KAFKA-4460:
--------------------------------------
Summary: Consumer stops getting messages when partition leader dies
Key: KAFKA-4460
URL: https://issues.apache.org/jira/browse/KAFKA-4460
Project: Kafka
Issue Type: Bug
Components: consumer
Affects Versions: 0.10.0.1
Reporter: Bernhard Bonigl
I have a setup consisting of 2 Kafka brokers (0 and 1) using ZooKeeper, a
Spring Boot application with producers and a Spring Boot application with
consumers.
The topic has 5 partitions and a replication factor of 2. Both brokers are in
sync, and partition leadership alternates between them (although it doesn't
matter).
The Spring Boot Kafka configuration is set up as follows:
{code}
kafka.address: localhost:9092,localhost:9093
kafka.numberOfConsumers: 20
{code}
Where Broker 0 uses port 9092 and Broker 1 uses port 9093.
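For illustration only, here is a minimal sketch of how the kafka.address value
above could map onto plain Kafka consumer properties. The class name, the group
id and the String deserializers are assumptions and are not taken from the
application. kafka.numberOfConsumers is left out because it is not a Kafka
client property.

```java
import java.util.Properties;

public class ConsumerPropsSketch {

    // Builds plain consumer properties from the kafka.address value.
    // groupId is a hypothetical parameter; the report does not name the group.
    static Properties buildConsumerProps(String kafkaAddress, String groupId) {
        Properties props = new Properties();
        // Both brokers are listed so the client can bootstrap from either one.
        props.setProperty("bootstrap.servers", kafkaAddress);
        props.setProperty("group.id", groupId);
        // Assumed deserializers; the real application may use different ones.
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }

    public static void main(String[] args) {
        Properties props =
                buildConsumerProps("localhost:9092,localhost:9093", "demo-group");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Listing both brokers in bootstrap.servers only affects the initial connection.
After that the client relies on cluster metadata to find partition leaders and
the group coordinator.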
----
When sending events they are consumed just fine. When Broker 0 is killed, all
partitions get Broker 1 as their leader; however, the consumers stop consuming
events until Broker 0 is back. The problem can be reproduced nearly every time;
it usually takes at most 3 attempts of alternately killing the leading broker
to reach the error state.
The console log is spammed by the coordinators. It looks like the coordinator
representing Broker 0 is marked as dead, but is instantly rediscovered and used
again many times, and only at the end is the other broker discovered. When the
switch works, the log is only minimally spammed and the other broker is
discovered very quickly.
This gist contains the log of the application when the problem occurs. The
first line is a log entry of ours indicating a successfully consumed message.
After that, Broker 0 (localhost:9092) is killed, and you can see the log spam
described above. At the end localhost:9093 is discovered; however, no further
messages are consumed. After that I killed the application.
----
I also discovered this unresolved Stack Overflow question, which seems to
describe the same problem.