[ 
https://issues.apache.org/jira/browse/HIVE-22548?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16987093#comment-16987093
 ] 

Steve Loughran commented on HIVE-22548:
---------------------------------------

Do you need the return code from removeEmptyDpDirectory()? The patch still 
makes listStatus calls that could be avoided: if you replaced the entire 
function with delete(path, false), only an empty directory would be deleted, 
so you would save the cost of a listing.
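
To illustrate the idea, here is a minimal sketch of "attempt a non-recursive 
delete instead of listing first". It is not the Hive patch and does not use the 
Hadoop API: it uses java.nio.file on a local filesystem as a stand-in so that 
it runs without extra dependencies, and the class and method names 
(EmptyDirCleanup, removeIfEmpty) are hypothetical. With Hadoop's 
FileSystem.delete(path, false), a non-empty directory may be reported as an 
exception rather than a false return value depending on the FileSystem 
implementation, so real code would need to handle that case as well.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class EmptyDirCleanup {

    /**
     * Hypothetical helper: tries to remove a directory without listing it first.
     * The caller is expected to pass a directory path.
     * Returns true if the directory was removed, false if it still has
     * children or no longer exists.
     */
    public static boolean removeIfEmpty(Path dir) throws IOException {
        try {
            // Non-recursive delete: succeeds only when the directory is empty.
            Files.delete(dir);
            return true;
        } catch (DirectoryNotEmptyException | NoSuchFileException e) {
            // Directory has contents, or was already removed; nothing to do.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("dp-demo");
        Path empty = Files.createDirectory(root.resolve("empty"));
        Path full = Files.createDirectory(root.resolve("full"));
        Files.createFile(full.resolve("000000_0"));

        System.out.println("empty removed: " + removeIfEmpty(empty));
        System.out.println("full removed: " + removeIfEmpty(full));
    }
}
```

The design choice is to let the storage layer enforce the "empty only" rule in 
a single call, rather than paying for a listing just to decide whether to 
delete.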

> Optimise Utilities.removeTempOrDuplicateFiles when moving files to final 
> location
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-22548
>                 URL: https://issues.apache.org/jira/browse/HIVE-22548
>             Project: Hive
>          Issue Type: Improvement
>          Components: Hive
>    Affects Versions: 3.1.2
>            Reporter: Rajesh Balamohan
>            Assignee: mahesh kumar behera
>            Priority: Major
>         Attachments: HIVE-22548.01.patch
>
>
> {{Utilities.removeTempOrDuplicateFiles}}
> is very slow with cloud storage, as it executes {{listStatus}} twice and also 
> runs in single-threaded mode.
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java#L1629



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
