[ 
https://issues.apache.org/jira/browse/IMPALA-11209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17956420#comment-17956420
 ] 

Michael Smith commented on IMPALA-11209:
----------------------------------------

I reproduced this after reverting IMPALA-13303. It does appear to be fixed by 
IMPALA-13303.

> Inconsistent results querying tables with subdirectories
> --------------------------------------------------------
>
>                 Key: IMPALA-11209
>                 URL: https://issues.apache.org/jira/browse/IMPALA-11209
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Catalog, Frontend
>            Reporter: Miklos Szurap
>            Priority: Major
>
> IMPALA-8454 introduced recursive listing of table/partition directories. 
> It does not seem to work correctly when this new behavior is intentionally 
> disabled through the impala.disable.recursive.listing=true table property. 
> Within the same session, each REFRESH statement on the table flips the 
> behavior; see the reproduction steps below.
> {code}
> CREATE EXTERNAL TABLE subdirtest (col1 string) partitioned by (p1 string) 
> TBLPROPERTIES ('impala.disable.recursive.listing'='true');
> ALTER TABLE subdirtest ADD PARTITION (p1='A');
> {code}
> Then ingest a file into a subdirectory of the partition:
> {code}
> hdfs dfs -mkdir /warehouse/tablespace/external/hive/subdirtest/p1=A/00
> hdfs dfs -put testdata.parq /warehouse/tablespace/external/hive/subdirtest/p1=A/00/
> {code}
> The file "testdata.parq" matches the table schema and contains two rows.
> {code}
> [coordinator.example.com:21050] default> refresh subdirtest;
> ...
> [coordinator.example.com:21050] default> select count(*) from subdirtest;
> +----------+
> | count(*) |
> +----------+
> | 0        |
> +----------+
> [coordinator.example.com:21050] default> refresh subdirtest;
> ...
> [coordinator.example.com:21050] default> select count(*) from subdirtest;
> +----------+
> | count(*) |
> +----------+
> | 2        |
> +----------+
> [coordinator.example.com:21050] default> refresh subdirtest;
> ...
> [coordinator.example.com:21050] default> select count(*) from subdirtest;
> +----------+
> | count(*) |
> +----------+
> | 0        |
> +----------+
> [coordinator.example.com:21050] default> refresh subdirtest;
> ...
> [coordinator.example.com:21050] default> select count(*) from subdirtest;
> +----------+
> | count(*) |
> +----------+
> | 2        |
> +----------+
> {code}
> This can be reproduced within a single impala-shell session (without any 
> other coordinators or load balancing).
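
For illustration only, the expected listing semantics and the observed flapping can be sketched in Python. This is not Impala's implementation: it walks a local directory rather than HDFS, and the names list_data_files, build_example_partition and results_are_stable are hypothetical helpers invented for this sketch. It assumes that impala.disable.recursive.listing=true should mean "only files directly inside the partition directory are visible".

```python
import os
import tempfile


def list_data_files(partition_dir, recursive):
    """Return sorted paths (relative to partition_dir) of the files found.

    recursive=False models the behavior requested through
    impala.disable.recursive.listing=true (an assumption of this sketch):
    files in subdirectories are ignored.
    """
    if recursive:
        found = []
        for root, _dirs, files in os.walk(partition_dir):
            for name in files:
                full_path = os.path.join(root, name)
                found.append(os.path.relpath(full_path, partition_dir))
        return sorted(found)
    # Non-recursive: only regular files directly inside the directory.
    return sorted(
        entry.name for entry in os.scandir(partition_dir) if entry.is_file()
    )


def build_example_partition(base_dir):
    """Create p1=A/00/testdata.parq under base_dir, mirroring the report."""
    subdir = os.path.join(base_dir, "p1=A", "00")
    os.makedirs(subdir)
    with open(os.path.join(subdir, "testdata.parq"), "wb") as handle:
        handle.write(b"placeholder, not a real Parquet file")
    return os.path.join(base_dir, "p1=A")


def results_are_stable(counts):
    """True if every count observed after a REFRESH is the same."""
    return len(set(counts)) <= 1
```

With the layout from the report, the non-recursive listing of p1=A finds no files, so a consistent count(*) of 0 would be expected while the property is set. The counts observed across the four refreshes above (0, 2, 0, 2) fail the results_are_stable check, which is the inconsistency this issue describes.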



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
