Rajesh Balamohan created HIVE-27014:
---------------------------------------
             Summary: Iceberg: getSplits/planTasks should filter out relevant folders instead of scanning entire table
                 Key: HIVE-27014
                 URL: https://issues.apache.org/jira/browse/HIVE-27014
             Project: Hive
          Issue Type: Improvement
          Components: Iceberg integration
            Reporter: Rajesh Balamohan


With dynamic partition pruning (DPP), only the relevant folders in fact tables are scanned. In Tez, DynamicPartitionPruner sets the relevant filters.

In Iceberg, these filters are applied only after "Table:planTasks()" has been invoked. This forces the metadata of the entire table to be scanned before the unwanted partitions are discarded, which makes split computation expensive (e.g., for store_sales, it has to look at all 1800+ partitions and then discard the unwanted ones). For short-running queries, split computation takes 3-5+ seconds.

Creating this ticket as a placeholder to make use of the relevant filters from DPP during split planning.
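The cost difference described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the Hive or Iceberg implementation: the classes Manifest and PlanResult, the methods planThenFilter and planWithFilter, the grouping of 100 partitions per manifest, and the two wanted partition keys are all hypothetical. Only the figure of 1800 partitions comes from the ticket. The model assumes that table metadata is grouped into units that each record the partition-key range they cover, so a filter known before planning can skip whole units.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy model only: these types are hypothetical and are not Hive or Iceberg classes.
class PartitionPruningSketch {

    // A group of partition entries plus the key range the group covers.
    static final class Manifest {
        final int minKey;
        final int maxKey;
        final List<Integer> partitionKeys;

        Manifest(int minKey, int maxKey, List<Integer> partitionKeys) {
            this.minKey = minKey;
            this.maxKey = maxKey;
            this.partitionKeys = partitionKeys;
        }
    }

    // Matching partitions and the number of metadata entries read to find them.
    static final class PlanResult {
        final List<Integer> partitions;
        final int entriesRead;

        PlanResult(List<Integer> partitions, int entriesRead) {
            this.partitions = partitions;
            this.entriesRead = entriesRead;
        }
    }

    // Behavior described in the ticket: read everything, then discard.
    static PlanResult planThenFilter(List<Manifest> manifests, Set<Integer> wanted) {
        int read = 0;
        List<Integer> all = new ArrayList<>();
        for (Manifest m : manifests) {
            for (int key : m.partitionKeys) {
                read++;
                all.add(key);
            }
        }
        List<Integer> kept = new ArrayList<>();
        for (int key : all) {
            if (wanted.contains(key)) {
                kept.add(key);
            }
        }
        return new PlanResult(kept, read);
    }

    // Requested behavior: use the filter during planning to skip whole groups.
    static PlanResult planWithFilter(List<Manifest> manifests, Set<Integer> wanted) {
        int read = 0;
        List<Integer> kept = new ArrayList<>();
        for (Manifest m : manifests) {
            boolean overlaps = false;
            for (int w : wanted) {
                if (w >= m.minKey && w <= m.maxKey) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) {
                continue; // no wanted key falls in this range, so nothing is read
            }
            for (int key : m.partitionKeys) {
                read++;
                if (wanted.contains(key)) {
                    kept.add(key);
                }
            }
        }
        return new PlanResult(kept, read);
    }

    // Builds partitions 0..partitions-1, split into groups of perManifest keys.
    static List<Manifest> sampleTable(int partitions, int perManifest) {
        List<Manifest> manifests = new ArrayList<>();
        for (int start = 0; start < partitions; start += perManifest) {
            int end = Math.min(start + perManifest, partitions);
            List<Integer> keys = new ArrayList<>();
            for (int k = start; k < end; k++) {
                keys.add(k);
            }
            manifests.add(new Manifest(start, end - 1, keys));
        }
        return manifests;
    }

    public static void main(String[] args) {
        List<Manifest> table = sampleTable(1800, 100);
        Set<Integer> wanted = Set.of(5, 1234); // hypothetical DPP result
        PlanResult late = planThenFilter(table, wanted);
        PlanResult early = planWithFilter(table, wanted);
        System.out.println("filter after planning:  read " + late.entriesRead
                + ", kept " + late.partitions.size());
        System.out.println("filter during planning: read " + early.entriesRead
                + ", kept " + early.partitions.size());
    }
}
```

Both methods return the same partitions. In this model the second one reads 200 entries instead of 1800, because only the two groups whose key range contains a wanted key are opened. The actual saving in Hive would depend on how the DPP filters map onto Iceberg's metadata layout.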