Dave Powell created KAFKA-3916:
----------------------------------
Summary: Connection from controller to broker disconnects
Key: KAFKA-3916
URL: https://issues.apache.org/jira/browse/KAFKA-3916
Project: Kafka
Issue Type: Bug
Components: controller
Affects Versions: 0.9.0.1
Reporter: Dave Powell
We recently upgraded from 0.8.2.1 to 0.9.0.1. Since then, several times per
day, the controllers in our clusters have their connection to all brokers
disconnected, and then successfully reconnected a few hundred ms later. Each
time this occurs we see a brief spike in our 99th percentile produce and
consume times, reaching several hundred ms.
Here is an example of what we're seeing in the controller.log:
{{
[2016-06-28 14:15:35,416] WARN [Controller-151-to-broker-160-send-thread],
Controller 151 epoch 106 fails to send request {…} to broker Node(160,
broker.160.hostname, 9092). Reconnecting to broker.
(kafka.controller.RequestSendThread)
java.io.IOException: Connection to 160 was disconnected before the response was
read
at
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:87)
at
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1$$anonfun$apply$1.apply(NetworkClientBlockingOps.scala:84)
at scala.Option.foreach(Option.scala:236)
at
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:84)
at
kafka.utils.NetworkClientBlockingOps$$anonfun$blockingSendAndReceive$extension$1.apply(NetworkClientBlockingOps.scala:80)
at
kafka.utils.NetworkClientBlockingOps$.recurse$1(NetworkClientBlockingOps.scala:129)
at
kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntilFound$extension(NetworkClientBlockingOps.scala:139)
at
kafka.utils.NetworkClientBlockingOps$.blockingSendAndReceive$extension(NetworkClientBlockingOps.scala:80)
at
kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:180)
at
kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:171)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
... one each for all brokers (including the controller) ...
[2016-06-28 14:15:35,721] INFO [Controller-151-to-broker-160-send-thread],
Controller 151 connected to Node(160, broker.160.hostname, 9092) for sending
state change requests (kafka.controller.RequestSendThread)
… one each for all brokers (including the controller) ...
}}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)