[ https://issues.apache.org/jira/browse/HIVE-20442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16904053#comment-16904053 ]
Rajkumar Singh commented on HIVE-20442:
---------------------------------------

The test timeout seems unrelated, but I am uploading a fresh patch for a clean run.

> Hive stale lock when the hiveserver2 background thread died with NPE
> --------------------------------------------------------------------
>
>                 Key: HIVE-20442
>                 URL: https://issues.apache.org/jira/browse/HIVE-20442
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Transactions
>    Affects Versions: 1.2.0, 2.1.1
>         Environment: Hive-2.1
>            Reporter: Rajkumar Singh
>            Assignee: Rajkumar Singh
>            Priority: Major
>         Attachments: HIVE-20442.01.branch-2.patch,
> HIVE-20442.1-branch-1.2.patch, HIVE-20442.2-branch-1.2.patch,
> HIVE-20442.3-branch-1.2.patch
>
>
> This looks like a race condition in which the background thread is not able to
> release the lock it acquired.
> 1. The hiveserver2 background thread requests the lock
> {code}
> 2018-08-20T14:13:38,813 INFO [HiveServer2-Background-Pool: Thread-XXXXX]:
> lockmgr.DbLockManager (DbLockManager.java:lock(100)) - Requesting:
> queryId=hive_xxxxxxx LockRequest(component:[LockComponent(type:SHARED_READ,
> level:TABLE, dbname:testdb, tablename:test_table, operationType:SELECT)],
> txnid:0, user:hive, hostname:HOSTNAME, agentInfo:hive_xxxxxxx)
> {code}
> 2. It acquires the lock and starts heartbeating
> {code}
> 2018-08-20T14:36:30,233 INFO [HiveServer2-Background-Pool: Thread-XXXXX]:
> lockmgr.DbTxnManager (DbTxnManager.java:startHeartbeat(517)) - Started
> heartbeat with delay/interval = 150000/150000 MILLISECONDS for
> query: agentInfo:hive_xxxxxxx
> {code}
> 3. During the time between events #1 and #2, the client disconnected and
> deleteContext cleaned up the session dir
> {code}
> 2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-XXX]:
> thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(136)) -
> Session disconnected without closing properly.
> 2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-XXXX]:
> thrift.ThriftCLIService (ThriftBinaryCLIService.java:deleteContext(140)) -
> Closing the session: SessionHandle [3be07faf-5544-4178-8b50-8173002b171a]
> 2018-08-21T15:39:57,820 INFO [HiveServer2-Handler-Pool: Thread-XXXX]:
> service.CompositeService (SessionManager.java:closeSession(363)) - Session
> closed, SessionHandle [xxxxxxxxxxxxxxxxxxxxxxx], current sessions:2
> {code}
> 4. The background thread died with an NPE while trying to get the queryid
> {code}
> java.lang.NullPointerException: null
> at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1568)
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1414)
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1211)
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1204)
> ~[hive-exec-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at
> org.apache.hive.service.cli.operation.SQLOperation.runQuery(SQLOperation.java:242)
> [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at
> org.apache.hive.service.cli.operation.SQLOperation.access$800(SQLOperation.java:91)
> [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at
> org.apache.hive.service.cli.operation.SQLOperation$BackgroundWork$1.run(SQLOperation.java:336)
> [hive-service-2.1.0.2.6.5.0-292.jar:2.1.0.2.6.5.0-292]
> at java.security.AccessController.doPrivileged(Native Method)
> [?:1.8.0_77]
> at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_77]
> {code}
> The background thread did not get a chance to release the lock, and the
> heartbeater thread continues to heartbeat indefinitely.

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
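
For illustration only, the failure mode in steps 1 to 4 of the description (a worker takes a lock, starts a heartbeat, then dies on an NPE before releasing) can be sketched in plain Java. This is not the attached patch and not Hive code: `StaleLockSketch`, `FakeLock`, and `runWithLock` are hypothetical names, and the heartbeat interval is shortened from the 150000 ms seen in the log. The sketch assumes the general remedy is to put heartbeat cancellation and lock release in a `finally` block so they run even when the query body throws.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

public class StaleLockSketch {

    /** Hypothetical stand-in for a lock manager; this is NOT Hive's DbLockManager. */
    static final class FakeLock {
        private final AtomicBoolean held = new AtomicBoolean(false);
        void acquire() { held.set(true); }
        void release() { held.set(false); }
        boolean isHeld() { return held.get(); }
    }

    /**
     * Acquires the lock, starts a periodic heartbeat, and runs the work.
     * Cleanup lives in finally, so it runs even if the work throws.
     */
    static void runWithLock(FakeLock lock, ScheduledExecutorService pool, Runnable work) {
        lock.acquire();
        // Interval shortened for the sketch; the log in the issue shows 150000 ms.
        ScheduledFuture<?> heartbeat =
            pool.scheduleAtFixedRate(() -> { /* would renew the lock here */ },
                                     150, 150, TimeUnit.MILLISECONDS);
        try {
            work.run();
        } finally {
            heartbeat.cancel(false); // stop heartbeating so the lock cannot be kept alive
            lock.release();          // release even when work.run() threw
        }
    }

    public static void main(String[] args) {
        FakeLock lock = new FakeLock();
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        try {
            runWithLock(lock, pool, () -> {
                // Simulates state that a concurrent session close has already removed.
                String queryId = null;
                queryId.length(); // throws NullPointerException
            });
        } catch (NullPointerException e) {
            System.out.println("query failed: NullPointerException");
        } finally {
            pool.shutdownNow();
        }
        System.out.println("lock still held: " + lock.isHeld());
    }
}
```

Without the `finally` block, the exception would skip both cleanup calls, which matches the symptom reported above: the lock stays held and the heartbeat keeps renewing it.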