[
https://issues.apache.org/jira/browse/IMPALA-14258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18022507#comment-18022507
]
ASF subversion and git services commented on IMPALA-14258:
----------------------------------------------------------
Commit e1896d4bf8d9568c277e99b664fa9abe8d4f6271 in impala's branch
refs/heads/master from Joe McDonnell
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=e1896d4bf ]
IMPALA-14258: Disable tuple caching for Full Hive ACID tables
TestAcidRowValidation.test_row_validation fails with tuple caching
correctness verification. The test creates a Full Hive ACID table
with a file using valid write ids, mimicking a streaming ingest.
As the valid write ids change, the scan of that file produces
different rows without the file changing. Tuple caching currently
doesn't understand valid write ids, so this produces incorrect
results.
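The failure mode described above can be illustrated with a toy model. This is only a sketch, not Impala code: the row layout, the `scan` helper, the `FileKeyedCache` class, and the file key string are all hypothetical names invented for the illustration. It assumes a cache keyed purely on file identity, which is the behavior the paragraph above attributes to tuple caching today.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical row layout: (write_id, value). In this toy model each row
# carries the write id of the transaction that produced it.
Row = Tuple[int, str]

# Hypothetical contents of a single data file that never changes on disk.
FILE_ROWS: List[Row] = [(1, "a1"), (2, "a2"), (3, "a3")]


def scan(rows: List[Row], valid_write_ids: FrozenSet[int]) -> List[str]:
    """Return only the values whose write id is currently valid."""
    return [value for write_id, value in rows if write_id in valid_write_ids]


class FileKeyedCache:
    """Toy cache keyed only on file identity (an assumption of this sketch).

    It ignores the valid write id list, so it cannot notice that the
    visible rows changed while the file stayed the same.
    """

    def __init__(self) -> None:
        self._entries: Dict[str, List[str]] = {}

    def get_or_compute(self, file_key: str,
                       valid_write_ids: FrozenSet[int]) -> List[str]:
        if file_key not in self._entries:
            self._entries[file_key] = scan(FILE_ROWS, valid_write_ids)
        return self._entries[file_key]


cache = FileKeyedCache()
file_key = "hypothetical-file@mtime=100"  # same file, same key, both times

# First query: only write id 1 is valid, so one row is visible and cached.
first = cache.get_or_compute(file_key, frozenset({1}))

# Second query: write id 2 has become valid, but the file key is unchanged,
# so the cache returns the stale entry.
second = cache.get_or_compute(file_key, frozenset({1, 2}))

# What an uncached scan would return for the second query.
fresh = scan(FILE_ROWS, frozenset({1, 2}))

print(first, second, fresh)
```

In this model the second lookup returns the cached single row while an uncached scan returns two rows, which is the kind of mismatch correctness verification reports.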
This marks Full Hive ACID tables as ineligible for caching until
valid write ids can be supported properly. Insert-only tables are
still eligible.
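The eligibility rule described above amounts to a simple check on the table type. The sketch below is illustrative only and does not reflect the actual change in Impala's frontend: `TableKind`, `TableInfo`, and `is_tuple_cache_eligible` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum


class TableKind(Enum):
    """Hypothetical classification of tables for this sketch."""
    NON_TRANSACTIONAL = "non_transactional"
    INSERT_ONLY_ACID = "insert_only_acid"
    FULL_ACID = "full_acid"


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical table descriptor; not an Impala class."""
    name: str
    kind: TableKind


def is_tuple_cache_eligible(table: TableInfo) -> bool:
    """Exclude only Full ACID tables, as the commit message describes.

    Insert-only tables stay eligible, matching the statement above.
    """
    return table.kind is not TableKind.FULL_ACID


full_acid = TableInfo("full_acid_tbl", TableKind.FULL_ACID)
insert_only = TableInfo("insert_only_tbl", TableKind.INSERT_ONLY_ACID)
plain = TableInfo("plain_tbl", TableKind.NON_TRANSACTIONAL)

for tbl in (full_acid, insert_only, plain):
    print(tbl.name, is_tuple_cache_eligible(tbl))
```

A blanket exclusion like this trades cache coverage for correctness until the cache can take valid write ids into account.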
Testing:
- Added test cases to TupleCacheTest
- Ran TestAcidRowValidation.test_row_validation with correctness
verification
Change-Id: Icab9613b8e2973aed1d34427c51d2fd8b37a9aba
Reviewed-on: http://gerrit.cloudera.org:8080/23454
Reviewed-by: Yida Wu <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Michael Smith <[email protected]>
> TestAcidRowValidation.test_row_validation fails with tuple caching
> ------------------------------------------------------------------
>
> Key: IMPALA-14258
> URL: https://issues.apache.org/jira/browse/IMPALA-14258
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Priority: Critical
>
> When running tuple caching with correctness checking,
> TestAcidRowValidation.test_row_validation fails with a correctness issue:
> {noformat}
> query_test/test_acid_row_validation.py:74: in test_row_validation
> self.run_test_case('QueryTest/acid-row-validation-2', vector,
> use_db=unique_database)
> common/impala_test_suite.py:886: in run_test_case
> result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:816: in __exec_in_impala
> result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:1294: in __execute_query
> return impalad_client.execute(query, user=user)
> common/impala_connection.py:692: in execute
> fetch_exec_summary=fetch_exec_summary, profile_format=profile_format)
> common/impala_connection.py:705: in __fetch_results_and_profile
> profile_format=profile_format)
> common/impala_connection.py:868: in __fetch_results
> result_tuples = cursor.fetchall()
> /data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:624:
> in fetchall
> elements = self._pop_from_buffer(self.buffersize)
> /data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:701:
> in _pop_from_buffer
> self._ensure_buffer_is_filled()
> /data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:683:
> in _ensure_buffer_is_filled
> convert_strings_to_unicode=self.convert_strings_to_unicode)
> /data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:1506:
> in fetch
> resp = self._rpc('FetchResults', req, False)
> /data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:1181:
> in _rpc
> err_if_rpc_not_ok(response)
> /data/jenkins/workspace/impala-private-basic-parameterized/repos/Impala/infra/python/env-gcc10.4.0/lib/python2.7/site-packages/impala/hiveserver2.py:867:
> in err_if_rpc_not_ok
> raise HiveServer2Error(resp.status.errorMessage)
> E HiveServer2Error: Query 82429bc4256e150b:567c70e400000000 failed:
> E Inconsistent tuple cache found: Result '[("a5" "b6")]' of file
> '/data/jenkins/workspace/tmp/impala-tuplecache-debugdump-0/tuple-cache-debug-dump/3c74c05ecd7a27da552d80ec1c68f446_3279139167/82429bc4256e150b:567c70e400000001_2.bad'
> doesn't exist in the reference file:
> '/data/jenkins/workspace/tmp/impala-tuplecache-debugdump-0/tuple-cache-debug-dump/3c74c05ecd7a27da552d80ec1c68f446_3279139167/82429bc4256e150b:567c70e400000001_2_fa45023fc3594f7b:14f9671700000001_2_ref.bad'.{noformat}
> Full ACID has a validWriteIdList that can impact the results for a table even
> without the underlying files changing. Tuple caching will need to handle this
> properly.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)