[ 
https://issues.apache.org/jira/browse/HIVE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087857#comment-14087857
 ] 

Vaibhav Gumashta commented on HIVE-7353:
----------------------------------------

Thanks for the review comments. I've taken a different approach now and the 
problem seems more generic. Here's the core issue:
RawStore is kept as a threadlocal variable. A RawStore object holds a reference 
to the JDOPersistenceManager object, which JDOPersistenceManagerFactory caches. 
To remove the JDOPersistenceManager from the cache, an explicit 
JDOPersistenceManager#close call is required. 
The issue is that in HiveServer2 we keep two threadpools (handler, for binary 
or HTTP mode, and async) managed by an ExecutorService. Based on the config, 
the threadpools keep a certain number of threads live and kill excess threads 
after a configurable keepAliveTime expires. However, ExecutorService does not 
provide a hook to plug in custom cleanup code when a thread is killed - ideally 
this is where we'd plug in code to close the JDOPersistenceManager stored in 
the threadlocal RawStore.
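
To make the leak concrete, here is a minimal sketch of the pattern described 
above. It is not Hive code: FakeManager, CACHE and STORE are hypothetical 
stand-ins for JDOPersistenceManager, the factory-side cache and the threadlocal 
RawStore. Each pool thread lazily creates its own manager, and nothing calls 
close() when the thread goes away, so the cache keeps every manager.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ThreadLocalLeakSketch {
    // Hypothetical stand-in for the factory-side cache of open managers.
    static final Set<FakeManager> CACHE = ConcurrentHashMap.newKeySet();

    // Hypothetical stand-in for a manager that stays cached until close() is called.
    static final class FakeManager {
        FakeManager() { CACHE.add(this); }
        void close() { CACHE.remove(this); }
    }

    // One manager per thread, mirroring the threadlocal RawStore.
    static final ThreadLocal<FakeManager> STORE = ThreadLocal.withInitial(FakeManager::new);

    // Runs one task on each of `threads` pool threads, lets the pool terminate,
    // and returns how many managers are still held by the cache afterwards.
    static int leakedAfterPoolShutdown(int threads) throws InterruptedException {
        CACHE.clear();
        ExecutorService pool = new ThreadPoolExecutor(
                threads, threads, 100, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        for (int i = 0; i < threads; i++) {
            // While the pool is below its core size, each task starts a new thread,
            // so each task initializes a separate thread-local manager.
            pool.execute(() -> STORE.get());
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        // All pool threads are gone, but no one called close() on their managers.
        return CACHE.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("managers still cached: " + leakedAfterPoolShutdown(4));
    }
}
```

The pool threads have exited by the time the method returns, yet the count 
equals the number of threads that ran, which is the shape of the leak reported 
in this issue.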

The current approach I've taken supplies a custom ThreadFactory when creating 
the threadpool; the threads it creates have a finalize method that does the 
cleanup. The ThreadFactory also maintains a map from each Thread to its 
RawStore object, and each thread's finalize method retrieves its RawStore 
from the map and performs the shutdown.
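
A rough sketch of that idea follows. It is an illustration under assumptions, 
not the attached patch: Store is a hypothetical stand-in for RawStore, and the 
names CleanupThreadFactorySketch, CleanupThread, register and cleanUp are made 
up for this example. The map is keyed by thread id rather than by the Thread 
object, on the assumption that a strong reference to the Thread would keep it 
reachable and so keep finalize from ever running.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadFactory;

class CleanupThreadFactorySketch implements ThreadFactory {
    // Hypothetical stand-in for RawStore; only the shutdown call matters here.
    interface Store { void shutdown(); }

    // Thread id -> store. Keyed by id so the map does not keep dead threads reachable.
    private static final Map<Long, Store> STORES = new ConcurrentHashMap<>();

    // Called on a pool thread once it has created its store.
    static void register(Store store) {
        STORES.put(Thread.currentThread().getId(), store);
    }

    // Removes the store recorded for the given thread id and shuts it down, if present.
    static void cleanUp(long threadId) {
        Store store = STORES.remove(threadId);
        if (store != null) {
            store.shutdown();
        }
    }

    // Thread subclass whose finalizer releases the store recorded for this thread.
    static final class CleanupThread extends Thread {
        CleanupThread(Runnable r) { super(r); }

        @Override
        @SuppressWarnings({"deprecation", "removal"})
        protected void finalize() throws Throwable {
            try {
                cleanUp(getId());
            } finally {
                super.finalize();
            }
        }
    }

    @Override
    public Thread newThread(Runnable r) {
        return new CleanupThread(r);
    }
}
```

An instance of this factory would be passed as the ThreadFactory argument when 
constructing the ThreadPoolExecutor. Note that finalizers run only when the 
garbage collector gets to the dead thread, so cleanup timing is not 
guaranteed, and Object#finalize is deprecated in recent Java releases.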

On another note, the remote metastore also uses an ExecutorService to maintain 
its threadpool. I haven't tested there, but a similar problem should exist in 
that case.

> HiveServer2 using embedded MetaStore leaks JDOPersistanceManager
> ----------------------------------------------------------------
>
>                 Key: HIVE-7353
>                 URL: https://issues.apache.org/jira/browse/HIVE-7353
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 0.13.0
>            Reporter: Vaibhav Gumashta
>            Assignee: Vaibhav Gumashta
>             Fix For: 0.14.0
>
>         Attachments: HIVE-7353.1.patch, HIVE-7353.2.patch
>
>
> When using an embedded metastore, HiveServer2 creates background threads to 
> run async operations and ends up creating new instances of 
> JDOPersistenceManager rather than using the one from the foreground (handler) 
> thread. Since JDOPersistenceManagerFactory caches JDOPersistenceManager 
> instances, they are never GCed.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
