[ 
https://issues.apache.org/jira/browse/HIVE-5575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chun Chen updated HIVE-5575:
----------------------------

    Status: Patch Available  (was: Open)

> ZooKeeper connection closed when unlock with retry
> --------------------------------------------------
>
>                 Key: HIVE-5575
>                 URL: https://issues.apache.org/jira/browse/HIVE-5575
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.11.0
>            Reporter: Chun Chen
>            Assignee: Chun Chen
>             Fix For: 0.13.0
>
>         Attachments: D13515.1.patch, HIVE-5575.patch, zookeeper session closed.png
>
>
> As the attachment shows, I hit a scenario in which Hive keeps retrying to 
> unlock all of its locks even though the ZooKeeper session is already closed. 
> If there are hundreds of locks, for example with dynamic partitions, the 
> process hangs for several days.
> The thread dump is:
> {code}
> Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.21-b01 mixed mode):
> "Attach Listener" daemon prio=10 tid=0x000000000683f000 nid=0x34d0 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
>       - None
> "LeaseChecker" daemon prio=10 tid=0x0000000006693800 nid=0x2713 waiting on condition [0x0000000042af7000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>       at java.lang.Thread.sleep(Native Method)
>       at org.apache.hadoop.hdfs.DFSClient$LeaseChecker.run(DFSClient.java:1376)
>       at java.lang.Thread.run(Thread.java:722)
>    Locked ownable synchronizers:
>       - None
> "Service Thread" daemon prio=10 tid=0x00002aaab8001000 nid=0x2651 runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
>       - None
> "C2 CompilerThread1" daemon prio=10 tid=0x0000000005c7c800 nid=0x2650 waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
>       - None
> "C2 CompilerThread0" daemon prio=10 tid=0x0000000005c71000 nid=0x264f waiting on condition [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
>       - None
> "Signal Dispatcher" daemon prio=10 tid=0x0000000005c6f000 nid=0x264e runnable [0x0000000000000000]
>    java.lang.Thread.State: RUNNABLE
>    Locked ownable synchronizers:
>       - None
> "Finalizer" daemon prio=10 tid=0x0000000005c22000 nid=0x264d in Object.wait() [0x00000000427f4000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
>       - locked <0x000000078324b110> (a java.lang.ref.ReferenceQueue$Lock)
>       at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
>       at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:189)
>    Locked ownable synchronizers:
>       - None
> "Reference Handler" daemon prio=10 tid=0x0000000005c1a000 nid=0x264c in Object.wait() [0x0000000041900000]
>    java.lang.Thread.State: WAITING (on object monitor)
>       at java.lang.Object.wait(Native Method)
>       at java.lang.Object.wait(Object.java:503)
>       at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
>       - locked <0x000000078328fbc0> (a java.lang.ref.Reference$Lock)
>    Locked ownable synchronizers:
>       - None
> "main" prio=10 tid=0x0000000005b76800 nid=0x263d waiting on condition [0x0000000040f46000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>       at java.lang.Thread.sleep(Native Method)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlockWithRetry(ZooKeeperHiveLockManager.java:426)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.unlock(ZooKeeperHiveLockManager.java:415)
>       at org.apache.hadoop.hive.ql.lockmgr.zookeeper.ZooKeeperHiveLockManager.releaseLocks(ZooKeeperHiveLockManager.java:257)
>       at org.apache.hadoop.hive.ql.Driver.releaseLocks(Driver.java:864)
>       at org.apache.hadoop.hive.ql.Driver.run(Driver.java:953)
>       at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
>       at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:413)
>       at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:348)
>       at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:446)
>       at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:456)
>       at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:712)
>       at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:614)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:601)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>    Locked ownable synchronizers:
>       - None
> "VM Thread" prio=10 tid=0x0000000005c12800 nid=0x264b runnable
> "GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000005b84800 nid=0x263e runnable
> "GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000005b86000 nid=0x263f runnable
> "GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000005b88000 nid=0x2640 runnable
> "GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000005b8a000 nid=0x2641 runnable
> "GC task thread#4 (ParallelGC)" prio=10 tid=0x0000000005b8b800 nid=0x2642 runnable
> "GC task thread#5 (ParallelGC)" prio=10 tid=0x0000000005b8d800 nid=0x2643 runnable
> "GC task thread#6 (ParallelGC)" prio=10 tid=0x0000000005b8f800 nid=0x2644 runnable
> "GC task thread#7 (ParallelGC)" prio=10 tid=0x0000000005b91000 nid=0x2645 runnable
> "GC task thread#8 (ParallelGC)" prio=10 tid=0x0000000005b93000 nid=0x2646 runnable
> "GC task thread#9 (ParallelGC)" prio=10 tid=0x0000000005b95000 nid=0x2647 runnable
> "GC task thread#10 (ParallelGC)" prio=10 tid=0x0000000005b96800 nid=0x2648 runnable
> "GC task thread#11 (ParallelGC)" prio=10 tid=0x0000000005b98800 nid=0x2649 runnable
> "GC task thread#12 (ParallelGC)" prio=10 tid=0x0000000005b9a800 nid=0x264a runnable
> "VM Periodic Task Thread" prio=10 tid=0x00002aaab800c000 nid=0x2652 waiting on condition
> JNI global references: 294
> {code}
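
To illustrate why the hang described above can last for days, here is a minimal sketch in Java. It is not the code from the attached patches. `LockClient`, `releaseAll`, and `sleepUpperBoundSeconds` are hypothetical names, and the numbers in `main` (500 locks, 10 retries, 60 seconds between retries) are illustrative values rather than confirmed Hive settings. With those values the sleeps alone add up to roughly 300,000 seconds, about 3.5 days. The sketch also shows one possible mitigation, assuming the client can tell that its session is gone: stop retrying as soon as that is detected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class UnlockRetrySketch {

    /** Hypothetical stand-in for a lock-service client; this is NOT the ZooKeeper API. */
    interface LockClient {
        boolean isSessionAlive();
        void delete(String lockPath) throws Exception;
    }

    /** Upper bound on time spent sleeping if every attempt on every lock fails. */
    static long sleepUpperBoundSeconds(int locks, int retriesPerLock, long sleepSeconds) {
        return (long) locks * retriesPerLock * sleepSeconds;
    }

    /**
     * Tries to release each lock, retrying on failure, but gives up on all
     * remaining locks as soon as the session is known to be closed.
     * Returns the paths that were not released.
     */
    static List<String> releaseAll(LockClient client, List<String> lockPaths,
                                   int retriesPerLock, long sleepMillis)
            throws InterruptedException {
        List<String> unreleased = new ArrayList<>();
        for (int i = 0; i < lockPaths.size(); i++) {
            String path = lockPaths.get(i);
            boolean released = false;
            for (int attempt = 0; attempt < retriesPerLock && !released; attempt++) {
                if (!client.isSessionAlive()) {
                    // Retrying cannot succeed on a closed session: fail fast.
                    unreleased.addAll(lockPaths.subList(i, lockPaths.size()));
                    return unreleased;
                }
                try {
                    client.delete(path);
                    released = true;
                } catch (Exception e) {
                    // Sleep only if another attempt will follow.
                    if (attempt + 1 < retriesPerLock) {
                        Thread.sleep(sleepMillis);
                    }
                }
            }
            if (!released) {
                unreleased.add(path);
            }
        }
        return unreleased;
    }

    public static void main(String[] args) throws InterruptedException {
        // Illustrative values only (assumptions, not confirmed Hive defaults).
        long seconds = sleepUpperBoundSeconds(500, 10, 60);
        System.out.println("Sleep upper bound in days: " + seconds / 86400.0);

        // A client whose session is already closed: every delete would fail.
        LockClient closed = new LockClient() {
            public boolean isSessionAlive() { return false; }
            public void delete(String lockPath) throws Exception {
                throw new IllegalStateException("session closed");
            }
        };
        List<String> left = releaseAll(closed, Arrays.asList("/lock/a", "/lock/b"), 10, 60_000L);
        System.out.println("Unreleased without sleeping: " + left);
    }
}
```

With the closed-session client, `releaseAll` returns immediately instead of sleeping between retries for every lock. Whether giving up is safe depends on how the locks are stored, which this sketch does not model.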



--
This message was sent by Atlassian JIRA
(v6.1#6144)
