[ https://issues.apache.org/jira/browse/KAFKA-1310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fedor Korotkiy updated KAFKA-1310:
----------------------------------
    Affects Version/s: 0.8.1

> Zookeeper timeout causes deadlock in Controller
> -----------------------------------------------
>
>                 Key: KAFKA-1310
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1310
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>            Reporter: Fedor Korotkiy
>
> Steps to reproduce:
> 1. Check out and build the 0.8.1 branch from GitHub:
>    git clone g...@github.com:apache/kafka.git && cd kafka && git checkout origin/0.8.1 && ./gradlew jar
> 2. Start the ZooKeeper server:
>    ./bin/zookeeper-server-start.sh config/zookeeper.properties
> 3. Start the Kafka server:
>    ./bin/kafka-server-start.sh config/server.properties
> 4. Suspend the ZooKeeper process for 10 seconds (Ctrl-Z to suspend, then %1 to resume).
> 5. Observe that Kafka has not re-registered itself in ZooKeeper:
>    ./bin/zookeeper-shell.sh
>    ls /brokers/ids
>    >> []
>
> The root cause of the problem seems to be a deadlock between DeleteTopicsThread
> and SessionExpirationListener in KafkaController:
> 1. DeleteTopicsThread acquires controllerLock and await()-s on
>    deleteTopicsCond in awaitTopicDeletionNotification().
> 2. SessionExpirationListener fires. It acquires controllerLock and tries to
>    shut down deleteTopicManager (in onControllerResignation()). This interrupts
>    DeleteTopicsThread.
> 3. DeleteTopicsThread can't return from deleteTopicsCond.await(), because
>    await() must reacquire controllerLock before returning, even when
>    interrupted, and the lock is still held by the listener, which is itself
>    waiting for the shutdown to complete. The result is a deadlock.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
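
The lock interaction described in the root-cause analysis can be reproduced outside Kafka with plain java.util.concurrent primitives. The following is a minimal sketch, not Kafka's actual code: the class name ControllerDeadlockSketch, the method workerStuckWhileLockHeld(), and the 500 ms bound are assumptions made for illustration, and only the names controllerLock and deleteTopicsCond are borrowed from the report. The sketch uses a bounded join() so that it terminates; as described in the report, the real shutdown does not give up, which is what turns the stall into a deadlock.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone sketch; not taken from the Kafka code base.
class ControllerDeadlockSketch {

    // Returns true if the worker was still blocked inside await() after being
    // interrupted by a thread that kept holding the lock.
    static boolean workerStuckWhileLockHeld() {
        ReentrantLock controllerLock = new ReentrantLock();
        Condition deleteTopicsCond = controllerLock.newCondition();
        CountDownLatch aboutToWait = new CountDownLatch(1);

        // Plays the role of DeleteTopicsThread (step 1 in the report).
        Thread worker = new Thread(() -> {
            controllerLock.lock();
            try {
                aboutToWait.countDown();
                deleteTopicsCond.await(); // releases the lock while waiting
            } catch (InterruptedException e) {
                // Reached only after await() has reacquired controllerLock.
            } finally {
                controllerLock.unlock();
            }
        });
        worker.setDaemon(true);
        worker.start();

        try {
            aboutToWait.await();
            boolean stuck;
            // Plays the role of SessionExpirationListener (step 2). lock()
            // succeeds only once the worker has released the lock in await().
            controllerLock.lock();
            try {
                worker.interrupt();
                // Bounded wait so the demo ends; an unbounded wait here
                // would never return (step 3).
                worker.join(500);
                stuck = worker.isAlive();
            } finally {
                controllerLock.unlock();
            }
            // With the lock released, the worker can finish.
            worker.join(5000);
            return stuck;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("demo thread was interrupted", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stuck while lock held: " + workerStuckWhileLockHeld());
    }
}
```

In this sketch the worker only exits once the interrupting thread releases controllerLock, which suggests two general ways out of such a situation: release the lock before waiting for the thread to stop, or wait with a timeout. Whether either matches the eventual fix for this issue is not stated in the report.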