[ https://issues.apache.org/jira/browse/IMPALA-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joe McDonnell resolved IMPALA-13898.
------------------------------------
Fix Version/s: Impala 5.0.0
Resolution: Fixed
> Tuple cache produces incorrect result when querying
> scale_db.num_partitions_1234_blocks_per_partition_1
> -------------------------------------------------------------------------------------------------------
>
> Key: IMPALA-13898
> URL: https://issues.apache.org/jira/browse/IMPALA-13898
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Critical
> Fix For: Impala 5.0.0
>
>
> Tuple caching generates the same key for these two queries:
> {noformat}
> select * from scale_db.num_partitions_1234_blocks_per_partition_1 where j=1
> select * from scale_db.num_partitions_1234_blocks_per_partition_1 where j=1 or j=2;
> {noformat}
> This scenario comes from catalog_service/test_large_num_partitions.py. It is a
> correctness issue: with identical keys, one query can be served the other's
> cached result.
> scale_db.num_partitions_1234_blocks_per_partition_1 is an exotic table where
> all of the partitions point to the same location / file. It also only has
> partition columns, so the contents of the file don't matter. This means that
> j=1 and j=2 both point to the same file. The partition information is not
> included in the key, so the two are indistinguishable. We'll need to expand
> what we put in the cache key to handle this scenario.
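The collision described above can be illustrated with a small Python sketch. This is a toy model, not Impala's actual key computation: the partition list, the file path, and the functions `prune`, `key_without_partitions`, and `key_with_partitions` are all hypothetical names introduced for illustration. It assumes the key is derived only from the distinct file locations that survive partition pruning, which is one plausible way for the two queries to become indistinguishable when every partition points to the same file.

```python
import hashlib

# Hypothetical model of the exotic table: every partition (identified by its
# partition-column value j) points to the same file location.
SHARED_FILE = "/scale_db/shared/file_0"  # made-up path for illustration
PARTITIONS = [(j, SHARED_FILE) for j in range(1, 5)]


def prune(wanted_js):
    """Return the (j, path) partitions selected by a predicate on j."""
    return [(j, path) for (j, path) in PARTITIONS if j in wanted_js]


def key_without_partitions(selected):
    """Toy cache key built only from the distinct file locations scanned."""
    h = hashlib.sha256()
    for path in sorted({path for _, path in selected}):
        h.update(path.encode())
        h.update(b"\0")
    return h.hexdigest()


def key_with_partitions(selected):
    """Toy cache key that also hashes each partition's column value."""
    h = hashlib.sha256()
    for j, path in sorted(selected):
        h.update(f"j={j}".encode())
        h.update(b"\0")
        h.update(path.encode())
        h.update(b"\0")
    return h.hexdigest()


q1 = prune({1})      # models: where j=1
q2 = prune({1, 2})   # models: where j=1 or j=2

print(key_without_partitions(q1) == key_without_partitions(q2))
print(key_with_partitions(q1) == key_with_partitions(q2))
```

In this model the first comparison finds the keys equal even though the queries select different partitions, while hashing the partition values alongside the file location separates them. How the real fix expands the key is not stated in this message.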