[ https://issues.apache.org/jira/browse/HIVE-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy Fingerman updated HIVE-28266:
-------------------------------------
    Summary: Iceberg: select count(*) from data_files metadata tables gives wrong result  (was: Iceberg: select count(*) from *.data_files metadata tables gives wrong result)

> Iceberg: select count(*) from data_files metadata tables gives wrong result
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-28266
>                 URL: https://issues.apache.org/jira/browse/HIVE-28266
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Dmitriy Fingerman
>            Assignee: Dmitriy Fingerman
>            Priority: Major
>
> In Hive Iceberg, every table has a corresponding metadata table
> "*.data_files" that contains information about the files holding the table's data.
> Running select count(*) against a data_files metadata table returns the number of rows
> in the data table instead of the number of data files listed in the metadata table.
>
> {code:java}
> CREATE TABLE x (name VARCHAR(50), age TINYINT, num_clicks BIGINT) stored by
> iceberg stored as orc TBLPROPERTIES
> ('external.table.purge'='true','format-version'='2');
> insert into x values
> ('amy', 35, 123412344),
> ('adxfvy', 36, 123412534),
> ('amsdfyy', 37, 123417234),
> ('asafmy', 38, 123412534);
> insert into x values
> ('amerqwy', 39, 123441234),
> ('amyxzcv', 40, 123341234),
> ('erweramy', 45, 122341234);
> Select * from default.x.data_files;
> -- Returns 2 records in the output
> Select count(*) from default.x.data_files;
> -- Returns 7 instead of 2
> {code}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)