[ https://issues.apache.org/jira/browse/HIVE-14187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15366713#comment-15366713 ]
Mohit Sabharwal commented on HIVE-14187: ---------------------------------------- Thanks [~vgumashta]. I took a quick look at HIVE-7353 and had a question. Regarding the comment "the threadpools keep a certain number of threads live and kill excess threads after a configurable keepAliveTime expires ... ideally this is where we'd plug in code to close the JDOPersistanceManager" -- this applies only to those threads where JDOPersistanceManager#close was not already called, correct ? In the remote metastore case, deleteContext() will get executed when the underling transport gets closed for that thread (in WorkerProcess Runnable run() method). https://github.com/apache/thrift/blob/master/lib/java/src/org/apache/thrift/server/TThreadPoolServer.java#L300 In this patch, we call cleanupRawStore inside deleteContext() (which calls JDOPersistanceManager#close) (If the client explicitly already called IMetaStoreClient#close on that connection, then the deleteContext() triggered cleanup will be a no-op.) I haven't looked at ThreadPoolExecutor code, but docs say "even core threads are initially created and started only when new tasks arrive", i.e. threads won't get pre-created (only to be killed later after keepalive timeout), but will get created when request arrives (which means there will be a transport associated with it, which will trigger deleteContext() upon connection break). IOW, thread goes back to the pool only after the deleteContext is called. And subsequently may get killed after keepalive timeout. So, unless someone manually kills the thread, it seems to me that deleteContext() covers the cleanup. (deleteContext() call is in the finally block so that covers errors/exceptions). What do you think ? > JDOPersistenceManager objects remain cached if MetaStoreClient#close is not > called > ---------------------------------------------------------------------------------- > > Key: HIVE-14187 > URL: https://issues.apache.org/jira/browse/HIVE-14187 > Project: Hive > Issue Type: Bug > Reporter: Mohit Sabharwal > Assignee: Mohit Sabharwal > Attachments: HIVE-14187.patch > > > JDOPersistenceManager objects are cached in JDOPersistenceManagerFactory by > DataNuclues. > A new JDOPersistenceManager object gets created for every HMS thread since > ObjectStore is a thread local. > In non-embedded metastore mode, JDOPersistenceManager associated with a > thread only gets cleaned up if IMetaStoreClient#close is called by the client > (which calls ObjectStore#shutdown which calls JDOPersistenceManager#close > which in turn removes the object from cache in > JDOPersistenceManagerFactory#releasePersistenceManager > https://github.com/datanucleus/datanucleus-api-jdo/blob/master/src/main/java/org/datanucleus/api/jdo/JDOPersistenceManagerFactory.java#L1271), > i.e. the object will remain cached if client does not call close. > For example: If one interrupts out of hive CLI shell (instead of using > 'exit;' command), SessionState#close does not get called, and hence > IMetaStoreClient#close does not get called. > Instead of relying the client to call close, it's cleaner to automatically > perform RawStore related cleanup at the server end via deleteContext() which > gets called when the server detects a lost/closed connection. -- This message was sent by Atlassian JIRA (v6.3.4#6332)