Sergio Peña created HIVE-11940:
----------------------------------

             Summary: "INSERT OVERWRITE" query is very slow because it creates 
one "distcp" per file to copy data from staging directory to target directory
                 Key: HIVE-11940
                 URL: https://issues.apache.org/jira/browse/HIVE-11940
             Project: Hive
          Issue Type: Bug
    Affects Versions: 1.2.1
            Reporter: Sergio Peña
            Assignee: Sergio Peña


When hive.exec.stagingdir is set to ".hive-staging", the staging directory is 
placed under the target directory. When an "INSERT OVERWRITE" query runs, Hive 
then grabs all files under the staging directory and copies them ONE BY ONE 
(one "distcp" per file) to the target directory.

When hive.exec.stagingdir is set to "/tmp/hive", Hive simply does a RENAME 
operation, which is instant.

This happens with files that are not encrypted. 
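
To illustrate why the two paths differ so much in cost, here is a minimal 
local-filesystem sketch. It is not Hive code: the helper names (move_by_copy, 
move_by_rename, make_staging) are hypothetical, and plain file copies stand in 
for the per-file "distcp" jobs. It only shows that the copy path does work 
proportional to the number of files, while the rename path is one operation.

```python
import os
import shutil
import tempfile


def move_by_copy(staging_dir, target_dir):
    """Copy each staged file individually, then remove the staging directory.

    Stands in for the slow path: one copy operation per file.
    """
    os.makedirs(target_dir, exist_ok=True)
    operations = 0
    for name in sorted(os.listdir(staging_dir)):
        shutil.copy2(os.path.join(staging_dir, name),
                     os.path.join(target_dir, name))
        operations += 1
    shutil.rmtree(staging_dir)
    return operations


def move_by_rename(staging_dir, target_dir):
    """Publish the staged output with a single directory rename.

    Stands in for the fast path: one metadata operation, no data copied.
    """
    os.rename(staging_dir, target_dir)
    return 1


def make_staging(root, name, file_count):
    """Create a hypothetical staging directory holding file_count small files."""
    staging = os.path.join(root, name)
    os.makedirs(staging)
    for i in range(file_count):
        with open(os.path.join(staging, "part-%05d" % i), "w") as f:
            f.write("row\n")
    return staging


root = tempfile.mkdtemp()
target_a = os.path.join(root, "target_a")
target_b = os.path.join(root, "target_b")

copy_ops = move_by_copy(make_staging(root, "staging_a", 5), target_a)
rename_ops = move_by_rename(make_staging(root, "staging_b", 5), target_b)

print("copy path operations:", copy_ops)
print("rename path operations:", rename_ops)
```

Both targets end up with the same files; only the number of operations needed 
to get there differs.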



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
