mahesh kumar behera created HIVE-20517:
------------------------------------------

             Summary: Creation of staging directory and Move operation is 
taking time in S3
                 Key: HIVE-20517
                 URL: https://issues.apache.org/jira/browse/HIVE-20517
             Project: Hive
          Issue Type: Bug
          Components: repl
    Affects Versions: 4.0.0
            Reporter: mahesh kumar behera
            Assignee: mahesh kumar behera
             Fix For: 4.0.0


Operations such as insert and add partition create a staging directory, 
generate the files there, and then move them to the actual location. In the 
replication flow, the files are first copied to the staging directory and then 
moved (renamed) to the actual table location. On S3, a move is not an atomic 
operation: internally it is a copy followed by a delete, so it cannot 
guarantee the required consistency. It is therefore better to copy the files 
directly to the actual location. This avoids both the staging directory 
creation (which takes 1-2 seconds on S3) and the move (which takes time 
proportional to file size).
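
To illustrate the cost difference, here is a minimal sketch using a toy 
in-memory object store. It is not Hive or S3 code: ToyObjectStore, 
load_via_staging, load_direct, and the paths are all hypothetical names made 
up for this example. The only assumption taken from the description above is 
that a rename is carried out as a copy followed by a delete.

```python
class ToyObjectStore:
    """In-memory stand-in for an object store with no native rename (assumption)."""

    def __init__(self):
        self.objects = {}        # key -> bytes
        self.bytes_written = 0   # total bytes written into the store

    def put(self, key, data):
        self.objects[key] = data
        self.bytes_written += len(data)

    def delete(self, key):
        del self.objects[key]

    def rename(self, src, dst):
        # No atomic rename: emulated as a copy followed by a delete,
        # so the cost grows with the size of the object.
        self.put(dst, self.objects[src])
        self.delete(src)


def load_via_staging(store, files, table_dir, staging_dir):
    """Current flow: write to staging, then move each file to the table."""
    for name, data in files.items():
        store.put(f"{staging_dir}/{name}", data)
    for name in files:
        store.rename(f"{staging_dir}/{name}", f"{table_dir}/{name}")


def load_direct(store, files, table_dir):
    """Proposed flow: write each file straight to the table location."""
    for name, data in files.items():
        store.put(f"{table_dir}/{name}", data)


files = {"part-0": b"a" * 1000, "part-1": b"b" * 500}

staged = ToyObjectStore()
load_via_staging(staged, files, "warehouse/t", "warehouse/t/.staging")

direct = ToyObjectStore()
load_direct(direct, files, "warehouse/t")

# Both stores end with the same objects; the staged flow writes every byte twice.
print(staged.bytes_written, direct.bytes_written)
```

In this toy model both flows leave the same files in the table directory, but 
the staged flow writes each byte twice, which mirrors the "time proportional 
to file size" cost of the move.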



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
