[ https://issues.apache.org/jira/browse/KAFKA-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13948546#comment-13948546 ]
Timothy Chen commented on KAFKA-1317:
-------------------------------------

Updated reviewboard https://reviews.apache.org/r/19577/
 against branch origin/0.8.1

> KafkaServer 0.8.1 not responding to .shutdown() cleanly, possibly related to TopicDeletionManager or MetricsMeter state
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1317
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1317
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1
>            Reporter: Brent Bradbury
>            Assignee: Timothy Chen
>            Priority: Blocker
>              Labels: newbie
>             Fix For: 0.8.1.1
>
>         Attachments: KAFKA-1317.patch, KAFKA-1317.patch,
>                      KAFKA-1317_2014-03-23_23:48:28.patch,
>                      KAFKA-1317_2014-03-24_11:06:15.patch,
>                      KAFKA-1317_2014-03-25_15:20:14.patch,
>                      KAFKA-1317_2014-03-26_09:48:03.patch,
>                      KAFKA-1317_2014-03-26_11:30:57.patch,
>                      KAFKA-1317_2014-03-26_15:09:48.patch,
>                      threaddump.txt
>
>
> When I run an in-process instance of KafkaServer, send a message through it,
> then call shutdown(), some threads never exit and the process hangs until the
> process is killed manually. The same scenario does not result in a hang on
> 0.8.0. The hang happens when calling both shutdown() by itself as well as
> shutdown() and awaitShutdown() together. I have seen similar behavior
> shutting down a deployed kafka server as well, but haven't had time to
> diagnose whether or not it is the same symptom.
>
> I suspect either the metrics-meter-tick-thread-1 & 2 or delete-topics-thread
> (waiting in
> kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$awaitTopicDeletionNotification(TopicDeletionManager.scala:178))
> is to blame. Since the TopicDeletionManager is new, it seems more suspicious
> to me. A complete thread dump is attached; the suspect threads are below.
>
> "delete-topics-thread" prio=5 tid=0x00007fb3e31d2800 nid=0x6b03 waiting on condition [0x000000013c3b3000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x000000012e6e6920> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>         at kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$awaitTopicDeletionNotification(TopicDeletionManager.scala:178)
>         at kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply$mcV$sp(TopicDeletionManager.scala:334)
>         at kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:333)
>         at kafka.controller.TopicDeletionManager$DeleteTopicsThread$$anonfun$doWork$1.apply(TopicDeletionManager.scala:333)
>         at kafka.utils.Utils$.inLock(Utils.scala:538)
>         at kafka.controller.TopicDeletionManager$DeleteTopicsThread.doWork(TopicDeletionManager.scala:333)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:51)
>    Locked ownable synchronizers:
>         - None
>
> "metrics-meter-tick-thread-2" daemon prio=5 tid=0x00007fb3e31c1000 nid=0x5f03 runnable [0x000000013ab8f000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x000000012e7d05d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
>    Locked ownable synchronizers:
>         - None
>
> "metrics-meter-tick-thread-1" daemon prio=5 tid=0x00007fb3e31ef800 nid=0x5e03 waiting on condition [0x000000013a98c000]
>    java.lang.Thread.State: WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for <0x000000012e7d05d8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1085)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
>         at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:724)
>    Locked ownable synchronizers:
>         - None

--
This message was sent by Atlassian JIRA (v6.2#6252)
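
For readers unfamiliar with the symptom in the dump: "delete-topics-thread" is a non-daemon thread parked in Condition.await(), so the JVM cannot exit until something signals it. The sketch below reproduces that general pattern in plain Java. It is not Kafka code and not the patch under review; AwaitingWorker, its fields, and the signal parameter are hypothetical names chosen only for illustration. It assumes the worker waits on a condition guarded by a lock, and shows that clearing the running flag without signalling the condition leaves the thread parked, while signalling lets it exit.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a worker that parks on a condition; not Kafka code.
final class AwaitingWorker {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition wake = lock.newCondition();
    private final CountDownLatch started = new CountDownLatch(1);
    private boolean running = true; // guarded by lock
    private final Thread thread = new Thread(this::run, "awaiting-worker");

    // Must be called before shutdown().
    void start() {
        thread.start();
    }

    private void run() {
        lock.lock();
        try {
            started.countDown();
            // The flag is only re-checked after a wakeup, so a shutdown that
            // never signals leaves this thread parked in await().
            while (running) {
                wake.awaitUninterruptibly();
            }
        } finally {
            lock.unlock();
        }
    }

    // Clears the running flag, optionally signals the condition, and reports
    // whether the worker thread exited within timeoutMs (which must be > 0).
    boolean shutdown(boolean signal, long timeoutMs) throws InterruptedException {
        started.await();
        // The worker holds the lock from countDown() until await() releases it,
        // so acquiring the lock here means the worker is already waiting.
        lock.lock();
        try {
            running = false;
            if (signal) {
                wake.signalAll();
            }
        } finally {
            lock.unlock();
        }
        thread.join(timeoutMs);
        return !thread.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        AwaitingWorker worker = new AwaitingWorker();
        worker.start();
        System.out.println("exited without signal: " + worker.shutdown(false, 200));
        System.out.println("exited with signal: " + worker.shutdown(true, 2000));
    }
}
```

The second call in main also cleans up after the first, so the demo itself terminates. Whether the actual hang in 0.8.1 comes from a missing signal, from the lock being held elsewhere, or from something else entirely is for the patch review to establish; the sketch only shows why a thread in this state blocks process exit.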