[ https://issues.apache.org/jira/browse/KAFKA-2300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743533#comment-14743533 ]
ASF GitHub Bot commented on KAFKA-2300:
---------------------------------------

GitHub user fpj opened a pull request:

    https://github.com/apache/kafka/pull/212

    KAFKA-2300: Error in controller log when broker tries to rejoin cluster

    I have reopened this issue because the controller isn't cleaning up the
    state upon an exception and the test case was legitimately failing for me
    every now and then. I'm proposing a change to fix this.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/fpj/kafka 2300

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/212.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #212

----
commit dbd1bf3a91c3e15ed2d14bf941c41c87b8116608
Author: flavio junqueira <f...@apache.org>
Date:   2015-07-29T17:07:51Z

    KAFKA-2300: Error in controller log when broker tries to rejoin cluster

commit 9b6390ae1c474b90689ff53036120b4be44a3f8f
Author: flavio junqueira <f...@apache.org>
Date:   2015-07-29T22:36:16Z

    Updated package name and removed unnecessary imports.

commit f1261b15b007d08e87d0ed56f7ec3fecbeddc276
Author: flavio junqueira <f...@apache.org>
Date:   2015-07-30T09:57:34Z

    Fixed some style issues.

commit aa6ec90b15ac6d0e0f9e5a58d4fed7b1909d50c2
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-12T16:37:07Z

    KAFKA-2300: Wrapped all occurences of sendRequestToBrokers with try/catch and fixed string typo.

commit 7bd2edb83054a9be72dda3425930a68ea3ad494b
Author: flavio junqueira <f...@apache.org>
Date:   2015-08-12T16:40:13Z

    KAFKA-2300: Removed unnecessary s" occurrences.

commit d5cfba343dac5967733c9415d4574256efdd764a
Author: fpj <f...@apache.org>
Date:   2015-09-14T13:00:15Z

    Merge remote-tracking branch 'upstream/trunk' into 2300

commit 742519349463c879d8413aee2b3f12b2ae8888a8
Author: fpj <f...@apache.org>
Date:   2015-09-14T13:47:50Z

    KAFKA-2300: Cleaning the state of broker request batch upon an exception.

----

> Error in controller log when broker tries to rejoin cluster
> -----------------------------------------------------------
>
>                 Key: KAFKA-2300
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2300
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.2.1
>            Reporter: Johnny Brown
>            Assignee: Flavio Junqueira
>             Fix For: 0.9.0.0
>
>         Attachments: KAFKA-2300-controller-logs.tar.gz,
>                      KAFKA-2300-repro.patch, KAFKA-2300.patch, KAFKA-2300.patch
>
>
> Hello Kafka folks,
> We are having an issue where a broker attempts to join the cluster after
> being restarted, but is never added to the ISR for its assigned partitions.
> This is a three-node cluster, and the controller is broker 2.
> When broker 1 starts, we see the following message in broker 2's
> controller.log.
> {{
> [2015-06-23 13:57:16,535] ERROR [BrokerChangeListener on Controller 2]: Error while handling broker changes (kafka.controller.ReplicaStateMachine$BrokerChangeListener)
> java.lang.IllegalStateException: Controller to broker state change requests batch is not empty while creating a new one. Some UpdateMetadata state changes Map(2 -> Map([prod-sver-end,1] ->
> (LeaderAndIsrInfo:(Leader:-2,ISR:1,LeaderEpoch:0,ControllerEpoch:165),ReplicationFactor:1),AllReplicas:1)),
> 1 -> Map([prod-sver-end,1] ->
> (LeaderAndIsrInfo:(Leader:-2,ISR:1,LeaderEpoch:0,ControllerEpoch:165),ReplicationFactor:1),AllReplicas:1)),
> 3 -> Map([prod-sver-end,1] ->
> (LeaderAndIsrInfo:(Leader:-2,ISR:1,LeaderEpoch:0,ControllerEpoch:165),ReplicationFactor:1),AllReplicas:1)))
> might be lost
>   at kafka.controller.ControllerBrokerRequestBatch.newBatch(ControllerChannelManager.scala:202)
>   at kafka.controller.KafkaController.sendUpdateMetadataRequest(KafkaController.scala:974)
>   at kafka.controller.KafkaController.onBrokerStartup(KafkaController.scala:399)
>   at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:371)
>   at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359)
>   at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359)
>   at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
>   at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358)
>   at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357)
>   at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357)
>   at kafka.utils.Utils$.inLock(Utils.scala:535)
>   at kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356)
>   at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568)
>   at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> }}
> {{prod-sver-end}} is a topic we previously deleted. It seems some remnant of
> it persists in the controller's memory, causing an exception which interrupts
> the state change triggered by the broker startup.
> Has anyone seen something like this? Any idea what's happening here? Any
> information would be greatly appreciated.
> Thanks,
> Johnny

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
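
The failure mode described in the pull request (leftover batch state after an exception, which then trips the "batch is not empty while creating a new one" check) can be illustrated with a small toy model. This is only a sketch in Python, not Kafka's Scala code: the class `RequestBatch`, its methods `new_batch`, `add`, and `send_all`, and the helper `flaky_send` are hypothetical names invented for illustration, and it assumes the leftover state comes from a send that raises part-way through.

```python
class RequestBatch:
    """Toy model of a controller-to-broker request batch (hypothetical, not Kafka's API)."""

    def __init__(self, send):
        self._send = send      # callable(broker_id, updates); may raise
        self._pending = {}     # broker_id -> list of queued updates

    def new_batch(self):
        # Same kind of guard as in the logged exception: refuse to start
        # a new batch while updates from a previous one are still queued.
        if self._pending:
            raise RuntimeError(
                "batch is not empty while creating a new one: %r" % self._pending
            )

    def add(self, broker_id, update):
        self._pending.setdefault(broker_id, []).append(update)

    def send_all(self):
        try:
            for broker_id, updates in self._pending.items():
                self._send(broker_id, updates)
        finally:
            # Assumed fix pattern: clear the queued state even if a send
            # raises, so the next new_batch() call is not blocked.
            self._pending.clear()


def flaky_send(broker_id, updates):
    # Hypothetical transport that fails for broker 1.
    if broker_id == 1:
        raise ConnectionError("broker 1 unreachable")


batch = RequestBatch(flaky_send)
batch.new_batch()
batch.add(1, "UpdateMetadata(prod-sver-end, 1)")
try:
    batch.send_all()
except ConnectionError:
    pass
batch.new_batch()  # does not raise: the failed batch was cleared
```

Without the `finally` clause, the update queued for broker 1 would survive the failed send, and every later `new_batch()` call would raise until the process restarted, which resembles the behaviour reported above.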