[ 
https://issues.apache.org/jira/browse/HIVE-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973353#action_12973353
 ] 

Joydeep Sen Sarma commented on HIVE-1852:
-----------------------------------------

cool - the fsshell removal sounds good unless Yongqiang says otherwise.

i am pretty sure this patch breaks the load command with a wildcard though. it 
seems to me that the load command simply passes the input path (with the 
wildcard pattern) to the loadTable/loadPartition methods (via 
LoadTableDesc). these methods were previously capable of handling wildcards 
that matched a set of files. now they will not be able to do that. Ning - can 
you confirm this? (maybe add a test that loads a wildcard pattern?)
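
to illustrate the concern, here is a minimal sketch. it is not Hive code: it 
uses java.nio on the local filesystem as a stand-in for HDFS, and the class 
and method names (WildcardLoadSketch, loadMatching, literalPathExists) are 
made up for the example. the point is only that a wildcard has to be expanded 
into the set of matching files before anything is moved, while a single 
rename of the pattern taken as a literal path finds nothing. on HDFS the 
expansion step would presumably go through something like 
FileSystem.globStatus.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; local filesystem stands in for HDFS.
class WildcardLoadSketch {

    // Expand a glob such as "kv*.txt" under srcDir and move each matching
    // regular file into destDir. Returns the moved file names, sorted.
    static List<String> loadMatching(Path srcDir, String glob, Path destDir)
            throws IOException {
        Files.createDirectories(destDir);

        // Collect matches first so nothing is moved while the stream is open.
        List<Path> matches = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(srcDir, glob)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    matches.add(p);
                }
            }
        }

        List<String> moved = new ArrayList<>();
        for (Path p : matches) {
            Files.move(p, destDir.resolve(p.getFileName()));
            moved.add(p.getFileName().toString());
        }
        Collections.sort(moved);
        return moved;
    }

    // What a single rename would look at if the pattern were passed through
    // unexpanded: a path whose name literally contains the wildcard.
    static boolean literalPathExists(Path srcDir, String glob) {
        try {
            return Files.exists(srcDir.resolve(glob));
        } catch (InvalidPathException e) {
            // Some platforms reject '*' in a path name outright.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempDirectory("wildcard-src");
        Files.createFile(src.resolve("kv1.txt"));
        Files.createFile(src.resolve("kv2.txt"));
        Files.createFile(src.resolve("other.dat"));

        System.out.println("literal path exists: " + literalPathExists(src, "kv*.txt"));
        System.out.println("moved: " + loadMatching(src, "kv*.txt", src.resolve("table")));
    }
}
```

a test along these lines (load a pattern that matches more than one file, then 
check that all of them arrived) would catch the regression if it is real.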

on a more minor note - the checkPaths call that got taken out was checking for 
the presence of nested subdirectories inside the path being loaded. is this no 
longer necessary? (do we support directories within partitions/tables 
automatically at query time?)
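
for reference, the kind of check i mean looks roughly like the sketch below. 
this is not the actual checkPaths implementation, just an assumed 
approximation on the local filesystem with made-up names 
(NestedDirCheckSketch, hasNestedDirectories, checkLoadable): reject a load 
directory if any of its direct children is itself a directory.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration only; not Hive's checkPaths.
class NestedDirCheckSketch {

    // True if any direct child of loadDir is a directory.
    static boolean hasNestedDirectories(Path loadDir) throws IOException {
        try (DirectoryStream<Path> children = Files.newDirectoryStream(loadDir)) {
            for (Path child : children) {
                if (Files.isDirectory(child)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Fail the load up front instead of discovering the problem at query time.
    static void checkLoadable(Path loadDir) throws IOException {
        if (hasNestedDirectories(loadDir)) {
            throw new IOException("nested subdirectories found under " + loadDir);
        }
    }

    public static void main(String[] args) throws IOException {
        Path flat = Files.createTempDirectory("flat");
        Files.createFile(flat.resolve("part-00000"));
        checkLoadable(flat); // passes: only regular files

        Path nested = Files.createTempDirectory("nested");
        Files.createDirectory(nested.resolve("sub"));
        System.out.println("nested has subdirectories: " + hasNestedDirectories(nested));
    }
}
```

if the check is dropped on purpose, it would be good to know what happens when 
such a directory is renamed into a table or partition as a whole.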

> Reduce unnecessary DFSClient.rename() calls
> -------------------------------------------
>
>                 Key: HIVE-1852
>                 URL: https://issues.apache.org/jira/browse/HIVE-1852
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1852.2.patch, HIVE-1852.3.patch, HIVE-1852.patch
>
>
> On the Hive client side (MoveTask etc.), DFSClient.rename() is called for 
> every file inside a directory. This is very expensive for a large directory 
> on a busy DFS namenode. We should replace it with a single rename() call on 
> the whole directory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.