[ https://issues.apache.org/jira/browse/KAFKA-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299832#comment-14299832 ]
jaikiran pai commented on KAFKA-1907:
-------------------------------------
Created reviewboard https://reviews.apache.org/r/30477/diff/
against branch origin/trunk
> ZkClient can block controlled shutdown indefinitely
> ---------------------------------------------------
>
> Key: KAFKA-1907
> URL: https://issues.apache.org/jira/browse/KAFKA-1907
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8.2
> Reporter: Ewen Cheslack-Postava
> Attachments: KAFKA-1907.patch
>
>
> There are some calls to ZkClient via ZkUtils in
> KafkaServer.controlledShutdown() that can block indefinitely because they
> internally call waitUntilConnected. The ZkClient API doesn't provide an
> alternative with timeouts, so fixing this will require enforcing timeouts in
> some other way.
> This may be a more general issue if there are any non-daemon threads that
> also call ZkUtils methods.
> Stacktrace showing the issue:
> {code}
> "Thread-2" prio=10 tid=0xb3305000 nid=0x4758 waiting on condition [0x6ad69000]
> java.lang.Thread.State: TIMED_WAITING (parking)
> at sun.misc.Unsafe.park(Native Method)
> - parking to wait for <0x70a93368> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> at java.util.concurrent.locks.LockSupport.parkUntil(LockSupport.java:267)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitUntil(AbstractQueuedSynchronizer.java:2130)
> at org.I0Itec.zkclient.ZkClient.waitForKeeperState(ZkClient.java:636)
> at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:619)
> at org.I0Itec.zkclient.ZkClient.waitUntilConnected(ZkClient.java:615)
> at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:679)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766)
> at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761)
> at kafka.utils.ZkUtils$.readDataMaybeNull(ZkUtils.scala:456)
> at kafka.utils.ZkUtils$.getController(ZkUtils.scala:65)
> at kafka.server.KafkaServer.kafka$server$KafkaServer$$controlledShutdown(KafkaServer.scala:194)
> at kafka.server.KafkaServer$$anonfun$shutdown$1.apply$mcV$sp(KafkaServer.scala:269)
> at kafka.utils.Utils$.swallow(Utils.scala:172)
> at kafka.utils.Logging$class.swallowWarn(Logging.scala:92)
> at kafka.utils.Utils$.swallowWarn(Utils.scala:45)
> at kafka.utils.Logging$class.swallow(Logging.scala:94)
> at kafka.utils.Utils$.swallow(Utils.scala:45)
> at kafka.server.KafkaServer.shutdown(KafkaServer.scala:269)
> at kafka.server.KafkaServerStartable.shutdown(KafkaServerStartable.scala:42)
> at kafka.Kafka$$anon$1.run(Kafka.scala:42)
> {code}
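The description says the ZkClient API has no variant that takes a timeout, so the timeout has to be enforced from outside. One possible way to do that, shown here only as a rough sketch and not as what the attached patch or the reviewboard diff does, is to run the blocking call on a daemon worker thread and wait on a Future with a deadline. The names BoundedCall and callWithTimeout are made up for illustration, and the sleeping lambda merely stands in for a blocking ZooKeeper read.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper; not part of Kafka or ZkClient.
final class BoundedCall {
    // Workers are daemon threads, so a call that never returns
    // cannot keep the JVM alive on its own.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "bounded-call");
        t.setDaemon(true);
        return t;
    });

    /** Runs the call and returns fallback if it has not finished within timeoutMs. */
    static <T> T callWithTimeout(Callable<T> call, long timeoutMs, T fallback)
            throws InterruptedException, ExecutionException {
        Future<T> future = POOL.submit(call);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Ask the worker to stop by interrupting it; the caller moves on either way.
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a read that blocks while ZooKeeper is unreachable.
        String slow = callWithTimeout(() -> { Thread.sleep(60_000); return "controller-1"; }, 200, null);
        // Stand-in for a read that returns promptly.
        String fast = callWithTimeout(() -> "controller-1", 200, null);
        System.out.println(slow + " " + fast); // prints "null controller-1"
    }
}
```

With something along these lines, the shutdown path could give up after a bounded wait and carry on with an unclean shutdown. Whether the abandoned worker actually stops depends on how the wrapped call reacts to interruption, so this bounds the caller's wait rather than guaranteeing the underlying call ends.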