[ 
https://issues.apache.org/jira/browse/HIVE-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sankar Hariappan updated HIVE-20682:
------------------------------------
    Description: 
*Problem description:*

The master thread initializes the *sessionHive* object in *HiveSessionImpl* 
class when we open a new session for a client connection and by default all 
queries from this connection shares the same sessionHive object. 

If the master thread executes a *synchronous* query, it closes the sessionHive 
object (referred via thread local hiveDb) if  {{Hive.isCompatible}} returns 
false and sets new Hive object in thread local HiveDb but doesn't change the 
sessionHive object in the session. Whereas, *asynchronous* query execution via 
async threads never closes the sessionHive object and it just creates a new one 
if needed and sets it as their thread local hiveDb.

So, the problem can happen in the case where an *asynchronous* query is being 
executed by async threads refers to sessionHive object and the master thread 
receives a *synchronous* query that closes the same sessionHive object. 

Also, each query execution overwrites the thread local hiveDb object to 
sessionHive object which potentially leaks a metastore connection if the 
previous synchronous query execution re-created the Hive object.

*Possible Fix:*

The *sessionHive* object could be shared my multiple threads and so it 
shouldn't be allowed to be closed by any query execution threads when they 
re-create the Hive object due to changes in Hive configurations. But the Hive 
objects created by query execution threads should be closed when the thread 
exits.

So, it is proposed to have an *isAllowClose* flag (default: *true*) in Hive 
object which should be set to *false* for *sessionHive* and would be forcefully 
closed when the session is closed or released.

Also, when we reset *sessionHive* object with new one due to changes in 
*sessionConf*, the old one should be closed when no async thread is referring 
to it. This can be done using "*finalize*" method of Hive object where we can 
close HMS connection when Hive object is garbage collected.

cc [~pvary]

  was:
*Problem description:*

The master thread initializes the *sessionHive* object in *HiveSessionImpl* 
class when we open a new session for a client connection and by default all 
queries from this connection shares the same sessionHive object. 

If the master thread executes a *synchronous* query, it closes the sessionHive 
object (referred via thread local hiveDb) if  {{Hive.isCompatible}} returns 
false and sets new Hive object in thread local HiveDb but doesn't change the 
sessionHive object in the session. Whereas, *asynchronous* query execution via 
async threads never closes the sessionHive object and it just creates a new one 
if needed and sets it as their thread local hiveDb.

So, the problem can happen in the case where an *asynchronous* query is being 
executed by async threads refers to sessionHive object and the master thread 
receives a *synchronous* query that closes the same sessionHive object. 

Also, each query execution overwrites the thread local hiveDb object to 
sessionHive object which potentially leaks a metastore connection if the 
previous synchronous query execution re-created the Hive object.

*Possible Fix:*

The *sessionHive* object could be shared my multiple threads and so it 
shouldn't be allowed to be closed by any query execution threads when they 
re-create the Hive object due to changes in Hive configurations. But the Hive 
objects created by query execution threads should be closed when the thread 
exits.

So, it is proposed to have an *isAllowClose* flag (default: *true*) in Hive 
object which should be set to *false* for *sessionHive* and would be forcefully 
closed when the session is closed or released.

Also, when we reset *sessionHive* object with new one due to changes in 
*sessionConf*, the old one should be closed when no async thread is referring 
to it. This can be done using "finalize" method of Hive object where we can 
close HMS connection when Hive object is garbage collected.

cc [~pvary]


> Async query execution can potentially fail if shared sessionHive is closed by 
> master thread.
> --------------------------------------------------------------------------------------------
>
>                 Key: HIVE-20682
>                 URL: https://issues.apache.org/jira/browse/HIVE-20682
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0, 4.0.0
>            Reporter: Sankar Hariappan
>            Assignee: Sankar Hariappan
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-20682.01.patch, HIVE-20682.02.patch, 
> HIVE-20682.03.patch, HIVE-20682.04.patch
>
>
> *Problem description:*
> The master thread initializes the *sessionHive* object in *HiveSessionImpl* 
> class when we open a new session for a client connection and by default all 
> queries from this connection shares the same sessionHive object. 
> If the master thread executes a *synchronous* query, it closes the 
> sessionHive object (referred via thread local hiveDb) if  
> {{Hive.isCompatible}} returns false and sets new Hive object in thread local 
> HiveDb but doesn't change the sessionHive object in the session. Whereas, 
> *asynchronous* query execution via async threads never closes the sessionHive 
> object and it just creates a new one if needed and sets it as their thread 
> local hiveDb.
> So, the problem can happen in the case where an *asynchronous* query is being 
> executed by async threads refers to sessionHive object and the master thread 
> receives a *synchronous* query that closes the same sessionHive object. 
> Also, each query execution overwrites the thread local hiveDb object to 
> sessionHive object which potentially leaks a metastore connection if the 
> previous synchronous query execution re-created the Hive object.
> *Possible Fix:*
> The *sessionHive* object could be shared my multiple threads and so it 
> shouldn't be allowed to be closed by any query execution threads when they 
> re-create the Hive object due to changes in Hive configurations. But the Hive 
> objects created by query execution threads should be closed when the thread 
> exits.
> So, it is proposed to have an *isAllowClose* flag (default: *true*) in Hive 
> object which should be set to *false* for *sessionHive* and would be 
> forcefully closed when the session is closed or released.
> Also, when we reset *sessionHive* object with new one due to changes in 
> *sessionConf*, the old one should be closed when no async thread is referring 
> to it. This can be done using "*finalize*" method of Hive object where we can 
> close HMS connection when Hive object is garbage collected.
> cc [~pvary]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to