[ 
https://issues.apache.org/jira/browse/HIVE-26110?focusedWorklogId=752340&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-752340
 ]

ASF GitHub Bot logged work on HIVE-26110:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 04/Apr/22 16:04
            Start Date: 04/Apr/22 16:04
    Worklog Time Spent: 10m 
      Work Description: szlta opened a new pull request, #3174:
URL: https://github.com/apache/hive/pull/3174

   Bulk inserts into a partitioned Iceberg table create lots of files because 
the SortedDynPartitionOptimizer doesn't set the key->reducer affinity. This 
can be done by simply marking the sort expressions as 'partition' columns.
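
A toy model of why key->reducer affinity matters, offered as a sketch rather 
than Hive's actual code. It assumes each reducer writes one file per partition 
it receives rows for; the function names, the 8 reducers and the 20 partitions 
are all made up for illustration.

```python
import random

def count_files(partition_keys, num_reducers, assign):
    """Count output files, assuming one file per (reducer, partition) pair."""
    seen = set()
    for key in partition_keys:
        seen.add((assign(key, num_reducers), key))
    return len(seen)

rng = random.Random(0)  # fixed seed so the run is repeatable

def scattered(key, num_reducers):
    # No affinity: a row may land on any reducer, regardless of its partition.
    return rng.randrange(num_reducers)

def by_partition_key(key, num_reducers):
    # Affinity: every row of a given partition goes to the same reducer.
    return hash(key) % num_reducers

# Hypothetical workload: 10,000 rows spread evenly over 20 partitions.
rows = [i % 20 for i in range(10_000)]

scattered_files = count_files(rows, 8, scattered)
affinity_files = count_files(rows, 8, by_partition_key)

print("files without affinity:", scattered_files)
print("files with affinity:   ", affinity_files)
```

In this model the affinity case yields exactly one file per partition, while 
the scattered case can yield up to (reducers x partitions) files.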




Issue Time Tracking
-------------------

            Worklog Id:     (was: 752340)
    Remaining Estimate: 0h
            Time Spent: 10m

> bulk insert into partitioned table creates lots of files in iceberg
> -------------------------------------------------------------------
>
>                 Key: HIVE-26110
>                 URL: https://issues.apache.org/jira/browse/HIVE-26110
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> For example, create the TPC-DS web_returns table in Iceberg format and copy 
> the data over from the regular table, with something like "insert into 
> web_returns_iceberg select * from web_returns".
> The data is inserted correctly, but each partition ends up with a lot of 
> files. IMO, the dynamic sort optimisation isn't working properly, and this 
> causes records not to be grouped in the final phase.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
