[ https://issues.apache.org/jira/browse/HIVE-3537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555953#comment-13555953 ]
Ashutosh Chauhan commented on HIVE-3537:
----------------------------------------

There seems to be a problem. My tests have been running for more than 15 hours and appear to be stuck. For example, the following tests took extraordinarily long to complete:

{code}
[junit] Done query: bucketmapjoin5.q elapsedTime=4421s
[junit] Done query: bucketmapjoin4.q elapsedTime=4419s
[junit] Done query: bucketmapjoin2.q elapsedTime=6625s
[junit] Done query: bucketmapjoin3.q elapsedTime=4414s
{code}

I was able to capture a jstack for one of them.

{code}
jstack 14269
2013-01-16 23:29:23
Full thread dump Java HotSpot(TM) Server VM (10.0-b19 mixed mode):

"Attach Listener" daemon prio=10 tid=0x08786400 nid=0x74a4 waiting on condition [0x00000000..0x00000000]
   java.lang.Thread.State: RUNNABLE

"main-EventThread" daemon prio=10 tid=0xce4ae000 nid=0x63ac waiting on condition [0xcedfe000..0xcedff030]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0xd684c6b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:493)

"main-SendThread(localhost:21818)" daemon prio=10 tid=0xcf142400 nid=0x63ab runnable [0xce9ad000..0xce9adfb0]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked <0xd6854078> (a sun.nio.ch.Util$1)
    - locked <0xd6854068> (a java.util.Collections$UnmodifiableSet)
    - locked <0xd6853bf8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:274)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)

"main-EventThread" daemon prio=10 tid=0xce4aa800 nid=0x63aa waiting on condition [0xce95c000..0xce95d130]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0xd6856b70> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:493)

"main-SendThread(localhost:21818)" daemon prio=10 tid=0xce4aa400 nid=0x63a9 runnable [0xcf5d7000..0xcf5d80b0]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked <0xd6857720> (a sun.nio.ch.Util$1)
    - locked <0xd6857710> (a java.util.Collections$UnmodifiableSet)
    - locked <0xd68572a0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:274)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)

"main-EventThread" daemon prio=10 tid=0xcf4f3c00 nid=0x37f1 waiting on condition [0xcf35c000..0xcf35d030]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0xd5528060> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:493)

"main-SendThread(localhost:21818)" daemon prio=10 tid=0xcf11f000 nid=0x37f0 runnable [0xce9fe000..0xce9fefb0]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked <0xd5528d48> (a sun.nio.ch.Util$1)
    - locked <0xd5528d38> (a java.util.Collections$UnmodifiableSet)
    - locked <0xd55288c8> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:274)
    at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1035)

"derby.rawStoreDaemon" daemon prio=10 tid=0x08edd800 nid=0x37e6 in Object.wait() [0xcf3ad000..0xcf3ae030]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xd50539f0> (a org.apache.derby.impl.services.daemon.BasicDaemon)
    at org.apache.derby.impl.services.daemon.BasicDaemon.rest(Unknown Source)
    - locked <0xd50539f0> (a org.apache.derby.impl.services.daemon.BasicDaemon)
    at org.apache.derby.impl.services.daemon.BasicDaemon.run(Unknown Source)
    at java.lang.Thread.run(Thread.java:619)

"Timer-0" daemon prio=10 tid=0x09070c00 nid=0x37e3 in Object.wait() [0xcf3fe000..0xcf3ff0b0]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xd4e87170> (a java.util.TaskQueue)
    at java.lang.Object.wait(Object.java:485)
    at java.util.TimerThread.mainLoop(Timer.java:483)
    - locked <0xd4e87170> (a java.util.TaskQueue)
    at java.util.TimerThread.run(Timer.java:462)

"derby.antiGC" daemon prio=10 tid=0x08da4800 nid=0x37e2 in Object.wait() [0xcf586000..0xcf586e30]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xd4e8a188> (a org.apache.derby.impl.services.monitor.AntiGC)
    at java.lang.Object.wait(Object.java:485)
    at org.apache.derby.impl.services.monitor.AntiGC.run(Unknown Source)
    - locked <0xd4e8a188> (a org.apache.derby.impl.services.monitor.AntiGC)
    at java.lang.Thread.run(Thread.java:619)

"ProcessThread(sid:0 cport:-1):" prio=10 tid=0xcfed1000 nid=0x37d7 waiting on condition [0xcf679000..0xcf679e30]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0xd4df4e60> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    at org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:119)

"SyncThread:0" prio=10 tid=0xcfed4800 nid=0x37d6 waiting on condition [0xcf6ca000..0xcf6cadb0]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0xd4df74d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1925)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:358)
    at org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:97)

"SessionTracker" prio=10 tid=0xcfd99800 nid=0x37d5 in Object.wait() [0xcf71b000..0xcf71bf30]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xd4df7e80> (a org.apache.zookeeper.server.SessionTrackerImpl)
    at org.apache.zookeeper.server.SessionTrackerImpl.run(SessionTrackerImpl.java:146)
    - locked <0xd4df7e80> (a org.apache.zookeeper.server.SessionTrackerImpl)

"NIOServerCxn.Factory:0.0.0.0/0.0.0.0:21818" daemon prio=10 tid=0xcfcd9000 nid=0x37d4 runnable [0xcf76c000..0xcf76ceb0]
   java.lang.Thread.State: RUNNABLE
    at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
    at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
    at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
    at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
    - locked <0xd4d4a100> (a sun.nio.ch.Util$1)
    - locked <0xd4d4a0f0> (a java.util.Collections$UnmodifiableSet)
    - locked <0xd4d49ec0> (a sun.nio.ch.EPollSelectorImpl)
    at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
    at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:194)
    at java.lang.Thread.run(Thread.java:619)

"Low Memory Detector" daemon prio=10 tid=0x0868e400 nid=0x37d0 runnable [0x00000000..0x00000000]
   java.lang.Thread.State: RUNNABLE

"CompilerThread1" daemon prio=10 tid=0x0868cc00 nid=0x37cf waiting on condition [0x00000000..0xd00f05f8]
   java.lang.Thread.State: RUNNABLE

"CompilerThread0" daemon prio=10 tid=0x0868a400 nid=0x37ce waiting on condition [0x00000000..0xd0171578]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x08689000 nid=0x37cd runnable [0x00000000..0xd01c2b10]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x0866e000 nid=0x37cc in Object.wait() [0xd0413000..0xd0413db0]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xd4dcd590> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:116)
    - locked <0xd4dcd590> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:132)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:159)

"Reference Handler" daemon prio=10 tid=0x0866cc00 nid=0x37cb in Object.wait() [0xd0464000..0xd0464f30]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    - waiting on <0xd4d54fa8> (a java.lang.ref.Reference$Lock)
    at java.lang.Object.wait(Object.java:485)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:116)
    - locked <0xd4d54fa8> (a java.lang.ref.Reference$Lock)

"main" prio=10 tid=0x08583000 nid=0x37c1 waiting on condition [0xf7ee7000..0xf7ee8208]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockWithRetry(ZooKeeperHiveLockManager.java:415)
    at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlock(ZooKeeperHiveLockManager.java:404)
    at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.releaseLocks(ZooKeeperHiveLockManager.java:251)
    at org.apache.hadoop.hive.ql.Driver.releaseLocks(Driver.java:862)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:948)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
    at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:774)
    at org.apache.hadoop.hive.cli.TestCliDriver.runTest(TestCliDriver.java:5609)
    at org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_bucketmapjoin5(TestCliDriver.java:1657)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:232)
    at junit.framework.TestSuite.run(TestSuite.java:227)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
    at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)

"VM Thread" prio=10 tid=0x08669c00 nid=0x37ca runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x08589c00 nid=0x37c2 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x0858b000 nid=0x37c3 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x0858c000 nid=0x37c4 runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x0858d000 nid=0x37c5 runnable

"GC task thread#4 (ParallelGC)" prio=10 tid=0x0858e000 nid=0x37c6 runnable

"GC task thread#5 (ParallelGC)" prio=10 tid=0x0858f400 nid=0x37c7 runnable

"GC task thread#6 (ParallelGC)" prio=10 tid=0x08590400 nid=0x37c8 runnable

"GC task thread#7 (ParallelGC)" prio=10 tid=0x08591400 nid=0x37c9 runnable

"VM Periodic Task Thread" prio=10 tid=0x0868fc00 nid=0x37d1 waiting on condition

JNI global references: 888
{code}

None of the tests have failed, and there is no deadlock in this jstack. But this stack and the extraordinarily long run time of these tests suggest that something is wrong here. Are you observing similar behavior in your environment?
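For context on the "main" thread in the dump: it is in TIMED_WAITING inside ZooKeeperHiveLockManager.unlockWithRetry, reached from Driver.releaseLocks. A retry-with-sleep loop of the general shape below would be consistent with that frame, and would add up to long elapsed times if unlock attempts fail repeatedly. This is only a minimal sketch under assumptions: the class name, the UnlockAttempt interface, and the maxRetries and sleepMillis parameters are hypothetical and are not taken from Hive's actual code.

```java
public class UnlockRetrySketch {

    /** Hypothetical stand-in for one unlock attempt; returns true on success. */
    interface UnlockAttempt {
        boolean tryUnlock();
    }

    /**
     * Tries to unlock up to maxRetries times, sleeping between failed attempts.
     * Returns the number of attempts that were needed.
     */
    static int unlockWithRetry(UnlockAttempt attempt, int maxRetries, long sleepMillis) {
        for (int tries = 1; tries <= maxRetries; tries++) {
            if (attempt.tryUnlock()) {
                return tries;
            }
            if (tries < maxRetries) {
                try {
                    // A thread dump taken here shows TIMED_WAITING (sleeping).
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting to retry unlock", e);
                }
            }
        }
        throw new IllegalStateException("unlock failed after " + maxRetries + " attempts");
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated lock that only releases on the third attempt.
        int tries = unlockWithRetry(() -> ++calls[0] >= 3, 5, 10L);
        System.out.println("unlocked after " + tries + " attempts");
    }
}
```

With a loop like this, the total delay per lock is bounded by roughly (maxRetries - 1) * sleepMillis, so many locks that each exhaust their retries could plausibly account for run times measured in thousands of seconds.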
> release locks at the end of move tasks
> --------------------------------------
>
> Key: HIVE-3537
> URL: https://issues.apache.org/jira/browse/HIVE-3537
> Project: Hive
> Issue Type: Bug
> Components: Locking, Query Processor
> Reporter: Namit Jain
> Assignee: Namit Jain
> Attachments: hive.3537.1.patch, hive.3537.2.patch, hive.3537.3.patch,
> hive.3537.4.patch, hive.3537.5.patch, hive.3537.6.patch, hive.3537.7.patch,
> hive.3537.8.patch, hive.3537.9.patch
>
>
> Look at HIVE-3106 for details.
> In order to make sure that concurrency is not an issue for multi-table
> inserts, the current option is to introduce a dependency task, which thereby
> delays the creation of all partitions. It would be desirable to release the
> locks for the outputs as soon as the move task is completed. That way, for
> multi-table inserts, the concurrency can be enabled without delaying any
> table.
> Currently, the move task contains an input/output, but they do not seem to be
> populated correctly.