[ 
https://issues.apache.org/jira/browse/HIVE-28530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883028#comment-17883028
 ] 

Xiaomin Zhang commented on HIVE-28530:
--------------------------------------

This issue seems to be related to the following Jira:

https://issues.apache.org/jira/browse/HIVE-21279

That Jira introduced a new HiveSequenceFileInputFormat with a volatile field, 
fileStatuses, which FetchOperator references twice in its getNextSplits() 
call, and only when the query result cache is disabled. Unfortunately, access 
to this field is not thread-safe, because the HiveSequenceFileInputFormat 
object itself is actually a shared object; volatile only guarantees that 
writes are visible to other threads, not that one query's write-then-read 
sequence is isolated from another's. Because of this, several failure 
scenarios are possible, such as:

1) One thread sets fileStatuses to null, then another thread overwrites it 
with its own result files ==> the first query gets a wrong result that belongs 
to another query

2) One thread sets fileStatuses to its result files, then another thread 
overwrites it with null ==> the first query gets an empty result
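
The two interleavings above can be replayed deterministically in a small Java 
sketch. Everything in it is hypothetical: SharedFormatSketch is a stand-in for 
the shared input format, not Hive's actual class, the file names are made up, 
and a null field is modeled as "no files" purely to follow the description 
above (the real fallback behavior in Hive may differ). It only illustrates why 
a volatile field on a shared object does not protect a set-then-read sequence.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a shared input format object (not Hive's class).
class SharedFormatSketch {
    // volatile makes writes visible to all threads, but any thread can still
    // overwrite the value between another thread's write and its later read.
    private volatile List<String> fileStatuses;

    void setFileStatuses(List<String> files) {
        this.fileStatuses = files;
    }

    // Assumption for this sketch: null is treated as "no files to read".
    List<String> getSplits() {
        List<String> files = fileStatuses;
        return files == null ? Collections.<String>emptyList() : files;
    }
}

class FileStatusesRaceSketch {
    // Scenario 1: query A clears the field, query B stores its files,
    // then query A reads and sees B's files.
    static List<String> scenarioWrongResult() {
        SharedFormatSketch shared = new SharedFormatSketch();
        shared.setFileStatuses(null);                                       // query A
        shared.setFileStatuses(Collections.singletonList("result-of-B"));  // query B
        return shared.getSplits();                                          // query A reads
    }

    // Scenario 2: query A stores its files, query B clears the field,
    // then query A reads and sees nothing.
    static List<String> scenarioEmptyResult() {
        SharedFormatSketch shared = new SharedFormatSketch();
        shared.setFileStatuses(Collections.singletonList("result-of-A"));  // query A
        shared.setFileStatuses(null);                                       // query B
        return shared.getSplits();                                          // query A reads
    }

    // With one instance per query, B's write cannot affect A's read.
    static List<String> perQueryInstance() {
        SharedFormatSketch forA = new SharedFormatSketch();
        SharedFormatSketch forB = new SharedFormatSketch();
        forA.setFileStatuses(Collections.singletonList("result-of-A"));
        forB.setFileStatuses(null);
        return forA.getSplits();
    }

    public static void main(String[] args) {
        System.out.println("scenario 1, query A sees: " + scenarioWrongResult());
        System.out.println("scenario 2, query A sees: " + scenarioEmptyResult());
        System.out.println("per-query instance, query A sees: " + perQueryInstance());
    }
}
```

The per-query instance variant is shown only as one possible way to remove the 
sharing; it is not necessarily the fix that Hive adopts for this issue.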

> Fetched result from another query
> ---------------------------------
>
>                 Key: HIVE-28530
>                 URL: https://issues.apache.org/jira/browse/HIVE-28530
>             Project: Hive
>          Issue Type: Bug
>      Security Level: Public(Viewable by anyone) 
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: Xiaomin Zhang
>            Priority: Major
>
> When running Hive load tests, we observed that Beeline can fetch a wrong 
> query result that belongs to another query running at the same time.  We 
> ruled out a load-balancing issue, because it happened on a single 
> HiveServer2.  We also found that this issue only happens when 
> *hive.query.results.cache.enabled is false.*
> All test queries have the following format: 
> {code:java}
> select concat('total record (test_recon_mock_$PID)=',count(*)) as 
> count_record from t1t
> {code}
> We randomized the query by replacing $PID with the Beeline PID, and the 
> test driver ran 10 Beeline instances concurrently.  The table t1t is static 
> and has a few rows, so the test driver can check whether the query result 
> is equal to: 
> total record (test_recon_mock_$PID)=2
> When the query result cache is disabled, queries randomly return a wrong 
> result, and the problem can always be reproduced.  For example, the two 
> queries below were running in parallel:
> {code:java}
> queryId=hive_20240701103742_ff1adb2d-e9eb-448d-990e-00ab371e9db6): select 
> concat('total record (test_recon_mock_21535)=',count(*)) as count_record from 
> t1t
> queryId=hive_20240701103742_9bdfff92-89e1-4bcd-88ea-bf73ba5fd93d): select 
> concat('total record (test_recon_mock_21566)=',count(*)) as count_record from 
> t1t
> {code}
> The second query is supposed to return the following result:
> *total record (test_recon_mock_21566)=2*
> But Beeline actually received the following result:
> *total record (test_recon_mock_21535)=2*
> There is no error in the HS2 log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
