[ https://issues.apache.org/jira/browse/HIVE-26110?focusedWorklogId=752340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-752340 ]
ASF GitHub Bot logged work on HIVE-26110:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Apr/22 16:04
            Start Date: 04/Apr/22 16:04
    Worklog Time Spent: 10m
      Work Description: szlta opened a new pull request, #3174:
URL: https://github.com/apache/hive/pull/3174

   Bulk insert into a partitioned table creates lots of files in Iceberg
   because the SortedDynPartitionOptimizer doesn't set the key->reducer
   affinity. That could be done by just marking the sort expressions as
   'partition' columns.

Issue Time Tracking
-------------------

            Worklog Id:     (was: 752340)
    Remaining Estimate: 0h
            Time Spent: 10m

> bulk insert into partitioned table creates lots of files in iceberg
> -------------------------------------------------------------------
>
>                 Key: HIVE-26110
>                 URL: https://issues.apache.org/jira/browse/HIVE-26110
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> For example, create the web_returns table from TPC-DS in Iceberg format and
> try to copy over data from the regular table, with something like
> "insert into web_returns_iceberg select * from web_returns".
> This inserts the data correctly; however, there are a lot of files present in
> each partition. IMO, the dynamic sort optimisation isn't working correctly,
> which causes records not to be grouped in the final phase.
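To illustrate why missing key->reducer affinity multiplies the file count, here is a toy simulation. It is not Hive or Iceberg code: the function names, the row layout, and the rule that each reducer writes one file per partition it receives rows for are all assumptions made for the sketch.

```python
def count_files(rows, num_reducers, route):
    """Count output files, assuming each reducer writes one file
    per partition key it receives at least one row for."""
    seen = set()
    for i, part_key in enumerate(rows):
        reducer = route(i, part_key, num_reducers)
        seen.add((reducer, part_key))
    return len(seen)


def round_robin(i, part_key, n):
    # No key->reducer affinity: rows are spread over reducers
    # without looking at the partition key.
    return i % n


def by_partition_key(i, part_key, n):
    # Key->reducer affinity: every row of a given partition
    # is routed to the same reducer.
    return part_key % n


# Hypothetical input: 1000 rows spread over 10 partitions, 4 reducers.
rows = [(i // 4) % 10 for i in range(1000)]

print(count_files(rows, 4, round_robin))       # 40: every reducer sees every partition
print(count_files(rows, 4, by_partition_key))  # 10: one file per partition
```

In this model, the file count without affinity grows with partitions times reducers, while routing by partition key keeps it at one file per partition.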