[ https://issues.apache.org/jira/browse/HIVE-25085?focusedWorklogId=596777&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-596777 ]
ASF GitHub Bot logged work on HIVE-25085:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 14/May/21 17:14
            Start Date: 14/May/21 17:14
    Worklog Time Spent: 10m
      Work Description:

scarlin-cloudera commented on a change in pull request #2238:
URL: https://github.com/apache/hive/pull/2238#discussion_r632676371

##########
File path: service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
##########
@@ -408,18 +400,10 @@ private synchronized void acquireAfterOpLock(boolean userAccess) {
     // set the thread name with the logging prefix.
     sessionState.updateThreadName();
-    // If the thread local Hive is different from sessionHive, it means, the previous query execution in
-    // master thread has re-created Hive object due to changes in MS related configurations in sessionConf.
-    // So, it is necessary to reset sessionHive object based on new sessionConf. Here, we cannot,
-    // directly set sessionHive with thread local Hive because if the previous command was REPL LOAD, then
-    // the config changes lives only within command execution not in session level.
-    // So, the safer option is to invoke Hive.get() which decides if to reuse Thread local Hive or re-create it.
-    if (Hive.getThreadLocal() != sessionHive) {
-      try {
-        setSessionHive();
-      } catch (HiveSQLException e) {
-        throw new RuntimeException(e);
-      }
+    try {

Review comment:
   Not sure I understand this comment. The "if" clause went away since we do
   not depend on anything with thread local storage to setSessionHive. The
   format changed only because the "if" statement went away. I guess I could
   leave it indented an extra 2 spaces...is that what you're suggesting?

##########
File path: service/src/java/org/apache/hive/service/cli/session/HiveSessionImpl.java
##########
@@ -240,21 +240,13 @@ protected int processCmd(String cmd) {
    * @throws HiveSQLException
    */
   private void setSessionHive() throws HiveSQLException {
-    Hive newSessionHive;
     try {
-      newSessionHive = Hive.get(getHiveConf());
-
-      // HMS connections from sessionHive shouldn't be closed by any query execution thread when it
-      // recreates the Hive object. It is allowed to be closed only when session is closed/released.
-      newSessionHive.setAllowClose(false);
+      sessionHive = sessionState.getHiveDb();
     } catch (HiveException e) {
-      throw new HiveSQLException("Failed to get metastore connection", e);

Review comment:
   Done

##########
File path: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
##########
@@ -2210,6 +2218,18 @@ public void endScope(String queryId) {
   public Map<Object, Object> getQueryCache(String queryId) {
     return cache.get(queryId);
   }
+
+  public Hive getHiveDb() throws HiveException {
+    if (hiveDb == null) {
+      hiveDb = Hive.createHiveForSession(sessionConf);
+      // Need to setAllowClose to false. For legacy reasons, the Hive object is stored
+      // in thread local storage. If allowClose is true, the session can get closed when
+      // the thread goes away which is not desirable when the Hive object is used across
+      // different queries in the session.
+      hiveDb.setAllowClose(false);

Review comment:
   I'm not sure I understand this. We are not closing here. This is just a
   value which ensures that when the thread goes away, the session stored in
   thread local storage will not be closed.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 596777)
    Time Spent: 1h 10m  (was: 1h)

> MetaStore Clients are being shared across different sessions
> ------------------------------------------------------------
>
>                 Key: HIVE-25085
>                 URL: https://issues.apache.org/jira/browse/HIVE-25085
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>            Reporter: Steve Carlin
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The Hive object (and the underlying MetaStoreClient object) seems to be
> getting shared across different sessions. While most operations work, there
> can be occasional glitches.
>
> One such noted glitch is that when session 1 ends, it closes the connection.
> If session 2 then tries an operation, the first try will fail. Normally this
> can proceed because the RetryingMetaStoreClient will re-establish a new
> connection, but in some operations, the retrying logic will not kick in (by
> design).
>
> It seems there was an attempt to fix this issue in HIVE-20682. However, this
> implementation seems to be flawed. The HiveSessionImpl object creates a Hive
> object and makes sure all thread queries belonging to the same session will
> run with the same Hive object. The flaw is that the initial Hive Object
> within HiveSessionImpl is created in thread local storage. The thread being
> run at that moment is not session specific. It belongs to a thread pool that
> happens to be handling this specific session.
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
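
For readers following the issue description, the sharing problem can be reproduced in miniature without any Hive code. The sketch below is illustrative only: `Handle`, `Session`, and `SessionHandleDemo` are hypothetical stand-ins, not Hive classes, and the single-thread executor is assumed to play the role of the HiveServer2 thread pool. It contrasts a handle cached in thread-local storage, which a second session picks up after the first has closed it, with a handle that each session creates lazily and owns, which is roughly the shape of the `getHiveDb()` change in the pull request.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class SessionHandleDemo {

    /** Hypothetical stand-in for a metastore-backed object; not the real Hive class. */
    static final class Handle {
        private static final AtomicInteger NEXT_ID = new AtomicInteger();
        final int id = NEXT_ID.incrementAndGet();
        private boolean closed;

        synchronized void close() { closed = true; }
        synchronized boolean isClosed() { return closed; }
    }

    /** Flawed pattern: the handle is cached per pool thread, not per session. */
    static final ThreadLocal<Handle> THREAD_LOCAL = ThreadLocal.withInitial(Handle::new);

    /** Hypothetical session that lazily creates and owns its handle. */
    static final class Session {
        private Handle handle;

        synchronized Handle getHandle() {
            if (handle == null) {
                handle = new Handle();
            }
            return handle;
        }
    }

    /**
     * Two "sessions" are served in turn by the same pool thread. Returns true if
     * the second one receives the handle that the first one already closed.
     */
    static boolean threadLocalLeaksClosedHandle() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<Handle> lookup = THREAD_LOCAL::get;
            Handle first = pool.submit(lookup).get();   // session 1 runs on the pool thread
            first.close();                              // session 1 ends and closes its handle
            Handle second = pool.submit(lookup).get();  // session 2 reuses the same thread
            return first == second && second.isClosed();
        } finally {
            pool.shutdown();
        }
    }

    /**
     * Same pool thread, but each session owns its handle. Returns true if closing
     * the first session's handle leaves the second session's handle untouched.
     */
    static boolean sessionScopedIsIsolated() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Session s1 = new Session();
            Session s2 = new Session();
            Callable<Handle> t1 = s1::getHandle;
            Callable<Handle> t2 = s2::getHandle;
            Handle h1 = pool.submit(t1).get();
            h1.close();                                 // session 1 ends
            Handle h2 = pool.submit(t2).get();
            return h1 != h2 && !h2.isClosed();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("thread-local handle leaked across sessions: "
                + threadLocalLeaksClosedHandle());
        System.out.println("session-scoped handles isolated: "
                + sessionScopedIsIsolated());
    }
}
```

The sketch deliberately leaves out retry logic and the `allowClose` flag discussed in the review; it only shows why ownership tied to a pooled thread differs from ownership tied to a session.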