[ https://issues.apache.org/jira/browse/KAFKA-1663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14159752#comment-14159752 ]
Jun Rao commented on KAFKA-1663:
--------------------------------

Neha,

Since this is a blocker for 0.8.2, could you push this to the 0.8.2 branch too?

Thanks,

> Controller unable to shutdown after a soft failure
> --------------------------------------------------
>
>                 Key: KAFKA-1663
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1663
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Sriharsha Chintalapani
>            Assignee: Sriharsha Chintalapani
>            Priority: Blocker
>             Fix For: 0.8.2
>
>         Attachments: KAFKA-1663.patch
>
>
> As part of testing KAFKA-1558 I came across a case where inducing a soft
> failure in the current controller elects a new controller but the old
> controller doesn't shut down properly.
> Steps to reproduce:
> 1) 5 broker cluster
> 2) high number of topics (I tested it with 1000 topics)
> 3) on the current controller do kill -SIGSTOP pid (the broker's process id)
> 4) wait for a bit over the zookeeper timeout (server.properties)
> 5) kill -SIGCONT pid
> 6) There will be a new controller elected. Check the old controller's
> log:
> [2014-09-30 15:59:53,398] INFO [SessionExpirationListener on 1], ZK expired;
> shut down all controller components and try to re-elect
> (kafka.controller.KafkaController$SessionExpirationListener)
> [2014-09-30 15:59:53,400] INFO [delete-topics-thread-1], Shutting down
> (kafka.controller.TopicDeletionManager$DeleteTopicsThread)
> If it stops there and the broker log keeps printing
> Cached zkVersion [0] not equal to that in zookeeper, skip updating ISR
> (kafka.cluster.Partition)
> then the controller shutdown never completes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
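
Steps 3 to 5 of the quoted reproduction (suspend the controller's process, wait past the ZooKeeper timeout, resume it) could be scripted roughly as follows. This is only a sketch: the function name `induce_soft_failure` and its parameters are hypothetical and not part of Kafka, and it assumes a POSIX system and that the "zookeeper timeout" in server.properties refers to `zookeeper.session.timeout.ms`.

```python
import os
import signal
import time


def induce_soft_failure(pid: int, pause_seconds: float) -> None:
    """Suspend a process with SIGSTOP, wait, then resume it with SIGCONT.

    Hypothetical helper mirroring steps 3-5 of the report; `pid` is the
    process id of the broker currently acting as controller.
    """
    os.kill(pid, signal.SIGSTOP)      # step 3: freeze the broker process
    try:
        time.sleep(pause_seconds)     # step 4: outlast the ZooKeeper timeout
    finally:
        os.kill(pid, signal.SIGCONT)  # step 5: let the broker run again
```

For example, if the timeout in server.properties were 6000 ms, `induce_soft_failure(broker_pid, 8.0)` would keep the broker frozen long enough for its session to expire, after which the old controller's log can be checked as described in step 6.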