[ https://issues.apache.org/jira/browse/HIVE-24294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Work on HIVE-24294 started by Naresh P R.
-----------------------------------------

> TezSessionPool sessions can throw AssertionError
> ------------------------------------------------
>
>                 Key: HIVE-24294
>                 URL: https://issues.apache.org/jira/browse/HIVE-24294
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Naresh P R
>            Assignee: Naresh P R
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Whenever default TezSessionPool sessions are reopened for some reason, we
> set dagResources to null before close and set it back in open:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezSessionPoolManager.java#L498-L503
> If sessionState.close() throws an exception, we do not restore
> dagResources, but we still move the session back to TezSessionPool.
> E.g., the exception trace when sessionState.close() failed:
> {code:java}
> 2020-10-15T09:20:28,749 INFO [HiveServer2-Background-Pool: Thread-25451]: client.TezClient (:()) - Failed to shutdown Tez Session via proxy
> org.apache.tez.dag.api.SessionNotRunning: Application not running, applicationId=application_1602093123456_12345, yarnApplicationState=FINISHED, finalApplicationStatus=SUCCEEDED, trackingUrl=http://localhost:8088/proxy/application_1602093123456_12345/, diagnostics=Session timed out, lastDAGCompletionTime=1602997683786 ms, sessionTimeoutInterval=600000 ms
> Session stats:submittedDAGs=2, successfulDAGs=2, failedDAGs=0, killedDAGs=0
>     at org.apache.tez.client.TezClientUtils.getAMProxy(TezClientUtils.java:910)
>     at org.apache.tez.client.TezClient.getAMProxy(TezClient.java:1060)
>     at org.apache.tez.client.TezClient.stop(TezClient.java:743)
>     at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.closeClient(TezSessionState.java:789)
>     at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.close(TezSessionState.java:756)
>     at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.close(TezSessionPoolSession.java:111)
>     at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopenInternal(TezSessionPoolManager.java:496)
>     at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolManager.reopen(TezSessionPoolManager.java:487)
>     at org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.reopen(TezSessionPoolSession.java:228)
>     at org.apache.hadoop.hive.ql.exec.tez.TezTask.getNewTezSessionOnError(TezTask.java:531)
>     at org.apache.hadoop.hive.ql.exec.tez.TezTask.submit(TezTask.java:546)
>     at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:221)
> {code}
> Because of this, all new queries using these corrupted sessions fail
> with the exception below:
> {code:java}
> Caused by: java.lang.AssertionError: Ensure called on an unitialized (or closed) session 41774265-b7da-4d58-84a8-1bedfd597aec
>     at org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:685)
> {code}
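
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not Hive code: `ReopenSketch`, `Session`, `failOnClose`, `reopenUnsafe` and `reopenGuarded` are hypothetical names invented for illustration, and the `finally`-based restore is only one possible guard, not necessarily the change made for this issue. It assumes nothing about the real `TezSessionPoolManager` beyond the pattern the description reports (null out dagResources, close, then restore).

```java
import java.util.ArrayList;
import java.util.List;

public class ReopenSketch {
    /** Hypothetical stand-in for a pooled session; not Hive's real class. */
    static class Session {
        List<String> dagResources = new ArrayList<>(List.of("hive-exec.jar"));
        boolean failOnClose; // simulates close() failing, e.g. the AM has already timed out

        void close() {
            if (failOnClose) {
                throw new IllegalStateException("Application not running");
            }
        }

        void ensureLocalResources() {
            if (dagResources == null) {
                throw new AssertionError("Ensure called on an uninitialized (or closed) session");
            }
        }
    }

    /** Mirrors the reported pattern: if close() throws, the restore is skipped. */
    static void reopenUnsafe(Session s) {
        List<String> saved = s.dagResources;
        s.dagResources = null;
        s.close();               // may throw
        s.dagResources = saved;  // never reached when close() throws
    }

    /** One possible guard (an assumption): restore the saved value in finally. */
    static void reopenGuarded(Session s) {
        List<String> saved = s.dagResources;
        s.dagResources = null;
        try {
            s.close();
        } finally {
            s.dagResources = saved; // runs whether or not close() throws
        }
    }

    public static void main(String[] args) {
        Session broken = new Session();
        broken.failOnClose = true;
        try {
            reopenUnsafe(broken);
        } catch (IllegalStateException e) {
            System.out.println("unsafe reopen failed: " + e.getMessage());
        }
        System.out.println("unsafe: dagResources lost = " + (broken.dagResources == null));

        Session guarded = new Session();
        guarded.failOnClose = true;
        try {
            reopenGuarded(guarded);
        } catch (IllegalStateException e) {
            System.out.println("guarded reopen failed: " + e.getMessage());
        }
        guarded.ensureLocalResources(); // does not throw: the saved value was restored
        System.out.println("guarded: dagResources lost = " + (guarded.dagResources == null));
    }
}
```

In the unsafe variant the session keeps a null dagResources after the failed close, so a later `ensureLocalResources()` call on the reused session would hit the assertion. Whether a session whose close failed should go back into the pool at all is a separate design question that this sketch does not address.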