[ https://issues.apache.org/jira/browse/HIVE-1852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12973353#action_12973353 ]
Joydeep Sen Sarma commented on HIVE-1852:
-----------------------------------------

Cool - the fsshell removal sounds good unless Yongqiang says otherwise.

I am pretty sure this patch breaks the load command with a wildcard, though. It seems to me that the load command simply passes the input path (with the wildcard pattern) to the loadTable/loadPartition methods (via LoadTableDesc). These commands were previously capable of handling wildcards that matched a set of files; now they will not be able to do that. Ning - can you confirm this? (Maybe add a test that tries to load a wildcard pattern?)

On a more minor note - the checkPaths call that got taken out was checking for the presence of nested subdirectories inside the path being loaded. Is this no longer necessary? (Do we support directories within partitions/tables automatically at query time?)

> Reduce unnecessary DFSClient.rename() calls
> -------------------------------------------
>
>                 Key: HIVE-1852
>                 URL: https://issues.apache.org/jira/browse/HIVE-1852
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>         Attachments: HIVE-1852.2.patch, HIVE-1852.3.patch, HIVE-1852.patch
>
>
> In Hive client side (MoveTask etc), DFSClient.rename() is called for every
> file inside a directory. It is very expensive for a large directory in a busy
> DFS namenode. We should replace it with a single rename() call on the whole
> directory.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
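
To make the wildcard concern in the comment above concrete, here is a minimal sketch using only the local filesystem and java.nio. It is not Hive or HDFS code: the class name WildcardLoadSketch and its methods (matchFiles, hasNestedDirectory, moveMatching) are hypothetical, and hasNestedDirectory is only a rough stand-in for what the comment says checkPaths was checking. The idea it illustrates is that a pattern such as "part-*" is not itself a path that can be renamed; it has to be expanded into the files it matches, which a single rename of the whole directory would skip.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical local-filesystem analogy; none of these names come from Hive.
class WildcardLoadSketch {

    // Expand a glob (e.g. "part-*") against srcDir and return the names of
    // the matching regular files that are direct children, sorted by name.
    static List<String> matchFiles(Path srcDir, String glob) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(srcDir, glob)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p)) {
                    names.add(p.getFileName().toString());
                }
            }
        }
        Collections.sort(names);
        return names;
    }

    // Report whether srcDir contains any subdirectory (assumed analogue of
    // the nested-subdirectory check mentioned in the comment).
    static boolean hasNestedDirectory(Path srcDir) throws IOException {
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(srcDir)) {
            for (Path p : stream) {
                if (Files.isDirectory(p)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Move only the files matching the glob, one move per file. Files that
    // do not match the pattern stay behind in srcDir.
    static int moveMatching(Path srcDir, String glob, Path destDir) throws IOException {
        Files.createDirectories(destDir);
        int moved = 0;
        for (String name : matchFiles(srcDir, glob)) {
            Files.move(srcDir.resolve(name), destDir.resolve(name));
            moved++;
        }
        return moved;
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempDirectory("load-src");
        Path dest = Files.createTempDirectory("load-dest").resolve("table");
        Files.createFile(src.resolve("part-00000"));
        Files.createFile(src.resolve("part-00001"));
        Files.createFile(src.resolve("_logs.txt"));
        System.out.println("nested dirs: " + hasNestedDirectory(src));
        System.out.println("moved: " + moveMatching(src, "part-*", dest));
    }
}
```

The per-file loop is exactly the cost the issue wants to avoid for plain directories, so a patch along these lines would presumably need to keep pattern expansion for wildcard inputs and use the single directory rename only when the input is a real directory.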