[ https://issues.apache.org/jira/browse/KAFKA-4360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15622377#comment-15622377 ]
Json Tu commented on KAFKA-4360:
--------------------------------

[~guozhang] [~junrao] Do you think this may be a bug? If so, I would be pleased to submit a pull request for it.

> Controller may deadLock when autoLeaderRebalance encounter zk expired
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-4360
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4360
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.9.0.0, 0.9.0.1, 0.10.0.0, 0.10.0.1
>            Reporter: Json Tu
>              Labels: bugfix
>         Attachments: yf-mafka2-common02_jstack.txt
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> When the controller has a checkAndTriggerPartitionRebalance task in
> autoRebalanceScheduler and the ZK session expires at that moment, the
> controller runs into a deadlock.
> The scenario can be reconstructed as follows. When the ZK session expires,
> the ZK thread calls handleNewSession, which is defined in
> SessionExpirationListener, and acquires controllerContext.controllerLock.
> It then calls autoRebalanceScheduler.shutdown(), which has to wait for all
> tasks in autoRebalanceScheduler to complete. But that thread pool also needs
> controllerContext.controllerLock, which is already held by the ZK callback
> thread, so the two threads deadlock.
> This causes at least two problems. First, the broker's id cannot be
> registered in ZooKeeper, so the new controller considers the broker dead.
> Second, the process cannot be stopped by kafka-server-stop.sh, because the
> shutdown function cannot acquire controllerContext.controllerLock either;
> the only way to shut Kafka down is kill -9.
> I ran jstack on my Kafka process after kafka-server-stop.sh failed to stop
> it; the output is in the attachment.
> I have hit this scenario several times. I think this may be an unresolved
> bug in Kafka. May I submit a pull request?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
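
The lock ordering described in the report can be illustrated with a small standalone Java sketch. This is not Kafka's controller code: the class ControllerLockSketch, the method shutdownScheduler and the local variables are hypothetical stand-ins for controllerContext.controllerLock and autoRebalanceScheduler. It assumes the scheduler shutdown waits for running tasks, and it uses a short timeout in place of an unbounded wait so the demo finishes instead of hanging. The second code path (release the lock before waiting) is only one possible way to avoid the cycle, not necessarily the fix that Kafka would adopt.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class ControllerLockSketch {

    // Returns true if the scheduler terminated within the wait window.
    // All names here are hypothetical; they only mirror the roles in the report.
    public static boolean shutdownScheduler(boolean holdLockDuringShutdown)
            throws InterruptedException {
        ReentrantLock controllerLock = new ReentrantLock();
        ExecutorService scheduler = Executors.newSingleThreadExecutor();
        CountDownLatch taskStarted = new CountDownLatch(1);

        // Stand-in for the ZK callback thread entering handleNewSession.
        controllerLock.lock();
        try {
            // Stand-in for checkAndTriggerPartitionRebalance: it needs the same lock.
            scheduler.execute(() -> {
                taskStarted.countDown();
                controllerLock.lock();
                try {
                    // Rebalance work would happen here.
                } finally {
                    controllerLock.unlock();
                }
            });
            taskStarted.await(); // the task is now running and about to block on the lock

            if (holdLockDuringShutdown) {
                // Problem pattern: wait for the task while still holding the lock it needs.
                // With an unbounded wait this would never return.
                scheduler.shutdown();
                return scheduler.awaitTermination(200, TimeUnit.MILLISECONDS);
            }
        } finally {
            controllerLock.unlock();
        }

        // Alternative ordering: the lock is released before waiting, so the task can finish.
        scheduler.shutdown();
        return scheduler.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("lock held during shutdown, terminated: " + shutdownScheduler(true));
        System.out.println("lock released before shutdown, terminated: " + shutdownScheduler(false));
    }
}
```

In the first call the wait times out because the task is blocked on the lock held by the caller, which is the same cycle the attached jstack is reported to show. In the second call the task acquires the lock as soon as it is released and the scheduler terminates normally.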