Sriharsha Chintalapani created KAFKA-1663:
---------------------------------------------
Summary: Controller unable to shutdown after a soft failure
Key: KAFKA-1663
URL: https://issues.apache.org/jira/browse/KAFKA-1663
Project: Kafka
Issue Type: Bug
Reporter: Sriharsha Chintalapani

As part of testing KAFKA-1558, I came across a case where inducing a soft failure in the current controller elects a new controller, but the old controller doesn't shut down properly.

Steps to reproduce:
1) Start a 5-broker cluster.
2) Create a high number of topics (I tested with 1000 topics).
3) On the current controller, run kill -SIGSTOP pid (pid is the broker's process ID).
4) Wait a bit longer than the ZooKeeper timeout (server.properties).
5) Run kill -SIGCONT pid.
6) A new controller will be elected. Check the old controller's log:

[2014-09-30 15:59:53,398] INFO [SessionExpirationListener on 1], ZK expired; shut down all controller components and try to re-elect (kafka.controller.KafkaController$SessionExpirationListener)
[2014-09-30 15:59:53,400] INFO [delete-topics-thread-1], Shutting down (kafka.controller.TopicDeletionManager$DeleteTopicsThread)

If the log stops there and the broker log keeps printing

Cached zkVersion [0] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)

then the controller shutdown never completes.
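
Steps 3 to 5 of the reproduction can be scripted so the pause length is repeatable. The following is only a minimal sketch, not part of the original report: the helper name pause_process and its parameters are hypothetical, it assumes a POSIX system, and it assumes the timeout in question is the broker's zookeeper.session.timeout.ms setting in server.properties.

```python
import os
import signal
import time


def pause_process(pid: int, pause_seconds: float) -> None:
    """Suspend a process with SIGSTOP, wait, then resume it with SIGCONT.

    Hypothetical helper for illustration; pid is assumed to be the
    process ID of the broker that currently holds the controller role.
    """
    # Step 3: freeze the process (SIGSTOP cannot be caught or ignored).
    os.kill(pid, signal.SIGSTOP)
    try:
        # Step 4: stay suspended long enough for the ZooKeeper session to expire.
        time.sleep(pause_seconds)
    finally:
        # Step 5: resume the process even if the sleep is interrupted.
        os.kill(pid, signal.SIGCONT)
```

For example, pause_process(broker_pid, 10.0) would suspend the broker for ten seconds; the value should be chosen to exceed whatever session timeout the cluster is actually configured with.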