[ https://issues.apache.org/jira/browse/HIVE-28530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17883028#comment-17883028 ]
Xiaomin Zhang commented on HIVE-28530:
--------------------------------------

The issue seems related to the following Jira:
https://issues.apache.org/jira/browse/HIVE-21279

That Jira introduced a new HiveSequenceFileInputFormat, which has a volatile field, fileStatuses. FetchOperator references this field twice in its getNextSplits() call, but only when the query result cache is disabled. Unfortunately, this field access is not thread-safe, because the HiveSequenceFileInputFormat object itself is actually a shared object. As a result, several failure scenarios are possible, such as:
1) One thread sets fileStatuses to null, then another thread overwrites it with its own result files ==> a wrong result from another query is fetched
2) One thread sets fileStatuses to its result files, then another thread overwrites it with null ==> an empty result is fetched

> Fetched result from another query
> ---------------------------------
>
>                 Key: HIVE-28530
>                 URL: https://issues.apache.org/jira/browse/HIVE-28530
>             Project: Hive
>          Issue Type: Bug
>      Security Level: Public(Viewable by anyone)
>          Components: HiveServer2
>    Affects Versions: 3.0.0
>            Reporter: Xiaomin Zhang
>            Priority: Major
>
> When running Hive load tests, we observed that Beeline can fetch a wrong
> query result, one that belongs to another query running at the same time.
> We ruled out a load-balancing issue, because it happened on a single
> HiveServer2. We also found that this issue only happens when
> *hive.query.results.cache.enabled is false.*
> All test queries have the same format, as below:
> {code:java}
> select concat('total record (test_recon_mock_$PID)=',count(*)) as
> count_record from t1t
> {code}
> We randomized the query by replacing $PID with the Beeline PID, and the
> test driver ran 10 Beeline instances concurrently. The table t1t is static
> and has a few rows.
> So the test driver can check whether the query result is equal to:
> total record (test_recon_mock_$PID)=2
> When the query result cache is disabled, queries randomly get a wrong
> result, and this can always be reproduced. For example, the two queries
> below were running in parallel:
> {code:java}
> queryId=hive_20240701103742_ff1adb2d-e9eb-448d-990e-00ab371e9db6): select
> concat('total record (test_recon_mock_21535)=',count(*)) as count_record from
> t1t
> queryId=hive_20240701103742_9bdfff92-89e1-4bcd-88ea-bf73ba5fd93d): select
> concat('total record (test_recon_mock_21566)=',count(*)) as count_record from
> t1t
> {code}
> The second query is supposed to get the following result:
> *total record (test_recon_mock_21566)=2*
> But Beeline actually got the following result:
> *total record (test_recon_mock_21535)=2*
> There is no error in the HS2 log.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
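
For illustration only, the two failure scenarios described in the comment can be replayed with a minimal Java sketch. It assumes a single object shared by all queries whose field is set in one step and read in a later one. The names used here (SharedFormatRace, SharedInputFormat, setFiles, getSplits) are hypothetical stand-ins and are not Hive's actual code. The interleavings are replayed sequentially rather than with real threads, so the outcome is repeatable.

```java
import java.util.Collections;
import java.util.List;

public class SharedFormatRace {

    // Hypothetical stand-in for an input format object shared by all queries.
    static class SharedInputFormat {
        // volatile makes writes visible to other threads, but it does not make
        // the "set, then read later" sequence atomic.
        private volatile List<String> fileStatuses;

        void setFiles(List<String> files) { this.fileStatuses = files; }

        List<String> getSplits() { return fileStatuses; }
    }

    static final List<String> FILES_B = Collections.singletonList("result_of_query_B");

    // Scenario 1: query A sets null, query B overwrites it with its own files,
    // then query A reads the field and sees query B's files.
    static List<String> scenarioWrongResult() {
        SharedInputFormat shared = new SharedInputFormat();
        shared.setFiles(null);      // query A, step 1
        shared.setFiles(FILES_B);   // query B, step 1 (overwrites A's value)
        return shared.getSplits();  // query A, step 2: reads B's files
    }

    // Scenario 2: query B sets its files, query A overwrites them with null,
    // then query B reads the field and finds nothing.
    static List<String> scenarioEmptyResult() {
        SharedInputFormat shared = new SharedInputFormat();
        shared.setFiles(FILES_B);   // query B, step 1
        shared.setFiles(null);      // query A, step 1 (overwrites B's value)
        return shared.getSplits();  // query B, step 2: reads null
    }

    // One possible remedy (an assumption, not necessarily the actual Hive fix):
    // give each query its own instance so no state is shared.
    static List<String> perQueryInstance() {
        SharedInputFormat forA = new SharedInputFormat();
        SharedInputFormat forB = new SharedInputFormat();
        forB.setFiles(FILES_B);     // query B, step 1
        forA.setFiles(null);        // query A, step 1 (touches only its own object)
        return forB.getSplits();    // query B, step 2: still its own files
    }

    public static void main(String[] args) {
        System.out.println("scenario 1, query A reads: " + scenarioWrongResult());
        System.out.println("scenario 2, query B reads: " + scenarioEmptyResult());
        System.out.println("per-query instance, query B reads: " + perQueryInstance());
    }
}
```

The sketch only shows why a volatile field on a shared object is not sufficient. Whether per-query instances, locking, or passing the files as an argument is the right change for Hive depends on how the input format is actually instantiated and cached.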