Fedor Korotkiy created KAFKA-1310:
-------------------------------------

             Summary: Zookeeper timeout causes deadlock in Controller
                 Key: KAFKA-1310
                 URL: https://issues.apache.org/jira/browse/KAFKA-1310
             Project: Kafka
          Issue Type: Bug
            Reporter: Fedor Korotkiy


Steps to reproduce:

1. Check out and build the 0.8.1 branch from GitHub:
git clone g...@github.com:apache/kafka.git && cd kafka && git checkout origin/0.8.1 && ./gradlew jar

2. Start the ZooKeeper server:
./bin/zookeeper-server-start.sh config/zookeeper.properties

3. Start the Kafka server:
./bin/kafka-server-start.sh config/server.properties

4. Suspend the ZooKeeper process for about 10 seconds (Ctrl-Z, then resume
it with %1).

5. Check the broker registration. Kafka has not re-registered in ZooKeeper:
./bin/zookeeper-shell.sh
ls /brokers/ids
>> []

The root cause seems to be a deadlock between DeleteTopicsThread and
SessionExpirationListener in KafkaController:

1. DeleteTopicsThread acquires controllerLock and calls await() on
deleteTopicsCond in awaitTopicDeletionNotification(). While it waits,
controllerLock is released.

2. SessionExpirationListener fires. It acquires controllerLock and tries to
shut down deleteTopicManager (in onControllerResignation()). This interrupts
DeleteTopicsThread.

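3. DeleteTopicsThread can't return from deleteTopicsCond.await() because it
must re-acquire controllerLock first, and the listener still holds that lock
while it waits for the shutdown to finish. The result is a deadlock.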
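
The three steps above can be reproduced outside Kafka. The following is only a
minimal sketch in plain Java, not the actual KafkaController code: the class
name ControllerDeadlockSketch and the method workerStuckWhileLockHeld are
hypothetical, and the lock, condition and thread are merely named after the
Kafka objects they stand in for. It assumes that the shutdown path waits for
the thread to exit while still holding the lock; a join() with a timeout
replaces that unbounded wait so that the sketch terminates.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-alone sketch; none of this is Kafka code.
public class ControllerDeadlockSketch {

    // Returns true if the interrupted waiter is still alive after timeoutMs,
    // i.e. it could not leave await() while the caller held the lock.
    static boolean workerStuckWhileLockHeld(long timeoutMs) throws InterruptedException {
        ReentrantLock controllerLock = new ReentrantLock();
        Condition deleteTopicsCond = controllerLock.newCondition();
        CountDownLatch lockTaken = new CountDownLatch(1);

        // Step 1: the worker takes the lock and waits on the condition.
        Thread deleteTopicsThread = new Thread(() -> {
            controllerLock.lock();
            try {
                lockTaken.countDown();
                deleteTopicsCond.await(); // releases the lock while waiting
            } catch (InterruptedException e) {
                // Reached only after await() has re-acquired the lock.
            } finally {
                controllerLock.unlock();
            }
        });
        deleteTopicsThread.setDaemon(true);
        deleteTopicsThread.start();
        lockTaken.await();

        // Step 2: the "listener" takes the lock. This succeeds only once the
        // worker has released it inside await(). It then interrupts the worker.
        controllerLock.lock();
        try {
            deleteTopicsThread.interrupt();
            // Step 3: wait for the worker while still holding the lock. With
            // an unbounded join() this would never return.
            deleteTopicsThread.join(timeoutMs);
            return deleteTopicsThread.isAlive();
        } finally {
            controllerLock.unlock(); // lets the worker finish in this sketch
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker stuck while lock held: "
                + workerStuckWhileLockHeld(500));
    }
}
```

In this sketch the worker only exits once the lock is released in the finally
block, which suggests that a fix would need to avoid waiting for
DeleteTopicsThread while controllerLock is held.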


