José Armando García Sancio created KAFKA-19343:
--------------------------------------------------
             Summary: Misconfigured broker listeners cause unrecoverable controller error
                 Key: KAFKA-19343
                 URL: https://issues.apache.org/jira/browse/KAFKA-19343
             Project: Kafka
          Issue Type: Bug
          Components: controller
    Affects Versions: 3.9.0
            Reporter: José Armando García Sancio

With a misconfigured broker it is possible for the controller to throw this exception.

{code:java}
[2025-05-07 08:47:50,245] ERROR Haven't been able to send leader and isr requests, current state of the map is HashMap(1 -> LeaderAndIsrBatch(version=6, brokerId=1, brokerEpoch=111669150272, controllerId=6, controllerEpoch=201, containsAllReplicas=false, numPartitions=334, numTopicIds=159, numLiveLeaders=2), 3 -> LeaderAndIsrBatch(version=6, brokerId=3, brokerEpoch=0, controllerId=0, controllerEpoch=0, containsAllReplicas=false, numPartitions=302, numTopicIds=178, numLiveLeaders=0), 4 -> LeaderAndIsrBatch(version=6, brokerId=4, brokerEpoch=0, controllerId=0, controllerEpoch=0, containsAllReplicas=false, numPartitions=228, numTopicIds=163, numLiveLeaders=0), 5 -> LeaderAndIsrBatch(version=6, brokerId=5, brokerEpoch=0, controllerId=0, controllerEpoch=0, containsAllReplicas=false, numPartitions=33, numTopicIds=19, numLiveLeaders=0), 6 -> LeaderAndIsrBatch(version=6, brokerId=6, brokerEpoch=0, controllerId=0, controllerEpoch=0, containsAllReplicas=false, numPartitions=261, numTopicIds=149, numLiveLeaders=0)). Exception message: kafka.common.BrokerEndPointNotAvailableException: End point with listener name INTERNAL_SCRAM not found for broker 1 (kafka.controller.ControllerBrokerRequestBatch)
[2025-05-07 08:47:50,245] ERROR Haven't been able to send metadata update requests, current state of the map is HashMap(1 -> UpdateMetadataBatch(version=7, brokerId=1, brokerEpoch=0, controllerId=0, controllerEpoch=0, hasNewBrokers=false, numPartitions=549, numTopicIds=203, numLiveBrokers=0), 3 -> UpdateMetadataBatch(version=7, brokerId=3, brokerEpoch=0, controllerId=0, controllerEpoch=0, hasNewBrokers=false, numPartitions=549, numTopicIds=203, numLiveBrokers=0), 4 -> UpdateMetadataBatch(version=7, brokerId=4, brokerEpoch=0, controllerId=0, controllerEpoch=0, hasNewBrokers=false, numPartitions=549, numTopicIds=203, numLiveBrokers=0), 5 -> UpdateMetadataBatch(version=7, brokerId=5, brokerEpoch=0, controllerId=0, controllerEpoch=0, hasNewBrokers=false, numPartitions=549, numTopicIds=203, numLiveBrokers=0), 6 -> UpdateMetadataBatch(version=7, brokerId=6, brokerEpoch=0, controllerId=0, controllerEpoch=0, hasNewBrokers=false, numPartitions=549, numTopicIds=203, numLiveBrokers=0)). Exception message: kafka.common.BrokerEndPointNotAvailableException: End point with listener name INTERNAL_SCRAM not found for broker 1 (kafka.controller.ControllerBrokerRequestBatch)
[2025-05-07 08:47:50,246] ERROR Haven't been able to send stop replica requests, current state of the map is HashMap(2 -> StopReplicaBatch(version=3, brokerId=2, brokerEpoch=0, controllerId=0, controllerEpoch=0, numPartitions=549, numTopicIds=203)).
Exception message: kafka.common.BrokerEndPointNotAvailableException: End point with listener name INTERNAL_SCRAM not found for broker 1 (kafka.controller.ControllerBrokerRequestBatch)
[2025-05-07 08:47:50,247] ERROR [ReplicaStateMachine controllerId=6] Error while moving some replicas to OfflineReplica state (kafka.controller.ZkReplicaStateMachine)
java.lang.IllegalStateException: kafka.common.BrokerEndPointNotAvailableException: End point with listener name INTERNAL_SCRAM not found for broker 1
	at kafka.controller.AbstractControllerBrokerRequestBatch.sendRequestsToBrokers(ControllerChannelManager.scala:708)
	at kafka.controller.ZkReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:120)
	at kafka.controller.KafkaController.onReplicasBecomeOffline(KafkaController.scala:770)
	at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:734)
	at kafka.controller.KafkaController.processBrokerChange(KafkaController.scala:1906)
	at kafka.controller.KafkaController.process(KafkaController.scala:3426)
	at kafka.controller.QueuedEvent.process(ControllerEventManager.scala:52)
	at kafka.controller.ControllerEventManager$ControllerEventThread.process$1(ControllerEventManager.scala:130)
	at kafka.controller.ControllerEventManager$ControllerEventThread.$anonfun$doWork$1(ControllerEventManager.scala:133)
	at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
	at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
	at kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:133)
	at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:99)
Caused by: kafka.common.BrokerEndPointNotAvailableException: End point with listener name INTERNAL_SCRAM not found for broker 1
	at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:96)
	at scala.Option.getOrElse(Option.scala:201)
	at kafka.cluster.Broker.node(Broker.scala:95)
	at kafka.controller.AbstractControllerBrokerRequestBatch.$anonfun$sendLeaderAndIsrRequest$3(ControllerChannelManager.scala:599)
	at scala.collection.mutable.HashSet$Node.foreach(HashSet.scala:435)
	at scala.collection.mutable.HashSet.foreach(HashSet.scala:361)
	at kafka.controller.AbstractControllerBrokerRequestBatch.$anonfun$sendLeaderAndIsrRequest$1(ControllerChannelManager.scala:597)
	at kafka.controller.AbstractControllerBrokerRequestBatch.$anonfun$sendLeaderAndIsrRequest$1$adapted(ControllerChannelManager.scala:588)
	at kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
	at scala.collection.mutable.HashMap$Node.foreachEntry(HashMap.scala:633)
	at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:499)
	at kafka.controller.AbstractControllerBrokerRequestBatch.sendLeaderAndIsrRequest(ControllerChannelManager.scala:588)
	at kafka.controller.AbstractControllerBrokerRequestBatch.sendRequestsToBrokers(ControllerChannelManager.scala:691)
	... 12 more
{code}

It looks like the controller doesn't handle this error correctly. Even if the broker configuration is later corrected, the controller cannot recover from the error. The workaround is to force a controller failover.
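For reference, a minimal sketch of the kind of listener misconfiguration that can trigger this. The controller resolves every live broker's endpoint by the inter-broker listener name, so a broker that registers in ZooKeeper without an endpoint named INTERNAL_SCRAM makes the lookup fail with BrokerEndPointNotAvailableException. The listener name INTERNAL_SCRAM is taken from the log above; the hostnames, ports, and the EXTERNAL listener are hypothetical.

{code}
# Hypothetical server.properties for broker 1 that could trigger the error:
# the broker only registers an EXTERNAL endpoint, while the controller looks
# brokers up by the listener name INTERNAL_SCRAM.
broker.id=1
listeners=EXTERNAL://0.0.0.0:9092
advertised.listeners=EXTERNAL://broker1.example.com:9092
listener.security.protocol.map=EXTERNAL:SASL_SSL
inter.broker.listener.name=EXTERNAL

# Hypothetical configuration on the other brokers (including the controller),
# which expect an INTERNAL_SCRAM endpoint on every broker:
# listeners=INTERNAL_SCRAM://0.0.0.0:9093,EXTERNAL://0.0.0.0:9092
# advertised.listeners=INTERNAL_SCRAM://brokerN.example.com:9093,EXTERNAL://brokerN.example.com:9092
# listener.security.protocol.map=INTERNAL_SCRAM:SASL_PLAINTEXT,EXTERNAL:SASL_SSL
# inter.broker.listener.name=INTERNAL_SCRAM
{code}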
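And a sketch of the failover workaround for a ZooKeeper-based controller (the ZkReplicaStateMachine in the stack trace implies ZooKeeper mode): deleting the /controller znode forces a new controller election. The ZooKeeper connection string below is hypothetical.

{code}
# Force a controller failover by deleting the /controller znode so that a new
# controller is elected (hypothetical ZooKeeper connection string):
bin/zookeeper-shell.sh zk1.example.com:2181 delete /controller

# Restarting the broker that currently holds the controller role (broker 6 in
# the logs above) would also force an election.
{code}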