[
https://issues.apache.org/jira/browse/IMPALA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Smith closed IMPALA-11209.
----------------------------------
Resolution: Duplicate
> Inconsistent results querying tables with subdirectories
> --------------------------------------------------------
>
> Key: IMPALA-11209
> URL: https://issues.apache.org/jira/browse/IMPALA-11209
> Project: IMPALA
> Issue Type: Bug
> Components: Catalog, Frontend
> Reporter: Miklos Szurap
> Priority: Major
>
> IMPALA-8454 introduced recursive listing of table and partition directories.
> The recursive listing does not appear to be disabled reliably when the
> impala.disable.recursive.listing=true table property is set. Within the same
> session, each refresh statement on the table flips the behavior; see the
> reproduction steps below.
> {code}
> CREATE EXTERNAL TABLE subdirtest (col1 string) partitioned by (p1 string)
> TBLPROPERTIES ('impala.disable.recursive.listing'='true');
> ALTER TABLE subdirtest ADD PARTITION (p1='A');
> {code}
> Then ingest a file into a subdirectory of the partition:
> {code}
> hdfs dfs -mkdir /warehouse/tablespace/external/hive/subdirtest/p1=A/00
> hdfs dfs -put testdata.parq /warehouse/tablespace/external/hive/subdirtest/p1=A/00/
> {code}
> The file "testdata.parq" matches the table schema and contains two rows.
> {code}
> [coordinator.example.com:21050] default> refresh subdirtest;
> ...
> [coordinator.example.com:21050] default> select count(*) from subdirtest;
> +----------+
> | count(*) |
> +----------+
> | 0        |
> +----------+
> [coordinator.example.com:21050] default> refresh subdirtest;
> ...
> [coordinator.example.com:21050] default> select count(*) from subdirtest;
> +----------+
> | count(*) |
> +----------+
> | 2        |
> +----------+
> [coordinator.example.com:21050] default> refresh subdirtest;
> ...
> [coordinator.example.com:21050] default> select count(*) from subdirtest;
> +----------+
> | count(*) |
> +----------+
> | 0        |
> +----------+
> [coordinator.example.com:21050] default> refresh subdirtest;
> ...
> [coordinator.example.com:21050] default> select count(*) from subdirtest;
> +----------+
> | count(*) |
> +----------+
> | 2        |
> +----------+
> {code}
> This can be reproduced within a single impala-shell session (without any
> other coordinators or load balancing).
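> One way to narrow this down, sketched here as a suggestion rather than as part
> of the original reproduction: after each refresh, check the table property and
> the files that Impala has loaded for the partition. This assumes that SHOW
> FILES reflects the same file metadata that the count(*) query uses.
> {code}
> -- Confirm that impala.disable.recursive.listing is still set to 'true'.
> DESCRIBE FORMATTED subdirtest;
> -- Reload metadata, then list the files loaded for the partition. With
> -- recursive listing disabled, the file under p1=A/00/ would presumably
> -- not appear in this listing.
> REFRESH subdirtest;
> SHOW FILES IN subdirtest PARTITION (p1='A');
> {code}
> If the listing alternates between empty and one file across refreshes, that
> would suggest the inconsistency lies in metadata loading rather than in the
> scan itself.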