I'm running into a problem with a 3-broker cluster where,
intermittently, the controller on one of the brokers begins to report
that it cannot connect to the other brokers and repeatedly logs the
failure.

Each broker runs in its own Docker container on a separate machine.
Each container exposes port 9092, which I think is sufficient for
operation, but I'm not sure.

The log messages are these:
[2017-04-27 17:16:28,985] WARN [Controller-3-to-broker-2-send-thread], Controller 3's connection to broker 64174aa85d04:9092 (id: 2 rack: null) was unsuccessful (kafka.controller.RequestSendThread)
java.io.IOException: Connection to 64174aa85d04:9092 (id: 2 rack: null) failed
        at kafka.utils.NetworkClientBlockingOps$.awaitReady$1(NetworkClientBlockingOps.scala:84)
        at kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:94)
        at kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:232)
        at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:185)
        at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:184)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
[2017-04-27 17:16:28,986] WARN [Controller-3-to-broker-1-send-thread], Controller 3's connection to broker d4b8943ad4b5:9092 (id: 1 rack: null) was unsuccessful (kafka.controller.RequestSendThread)
java.io.IOException: Connection to d4b8943ad4b5:9092 (id: 1 rack: null) failed
        at kafka.utils.NetworkClientBlockingOps$.awaitReady$1(NetworkClientBlockingOps.scala:84)
        at kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:94)
        at kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:232)
        at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:185)
        at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:184)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)

This is Kafka 0.10.2.0 (the Scala 2.12 build, kafka_2.12-0.10.2.0).
I'm wondering:

1. How do we figure out the cause of the connection failures?
2. What's the controller anyway?
3. Are there any command-line diagnostic tools for inspecting the health of
the system?
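
For question 1, this is roughly the kind of check I have in mind running
from inside the controller's container, to tell a name-resolution problem
apart from a blocked or closed port. It is only a sketch under my own
assumptions: the function name check_broker and the 3-second timeout are
made up for illustration and are not part of Kafka, and the hostnames in
the comment are simply the ones from the log above.

```python
import socket


def check_broker(host, port=9092, timeout=3.0):
    """Return 'ok', 'dns failed', or 'connect failed' for host:port.

    Hypothetical helper for illustration only; not a Kafka tool.
    """
    try:
        # Resolve first so a name-resolution problem is reported separately
        # from a TCP-level failure.
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns failed"

    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            # Refused, timed out, or unreachable: try the next address.
            continue
    return "connect failed"


# Example (hostnames copied from the log above):
#   for broker in ("64174aa85d04", "d4b8943ad4b5"):
#       print(broker, check_broker(broker))
```

If this reported "dns failed" for the two hostnames in the log, I would
suspect that the other containers' hostnames are not resolvable from the
controller's container, rather than a problem with port 9092 itself.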

Thanks for any help,
Chuck
