[ https://issues.apache.org/jira/browse/HIVE-20517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608739#comment-16608739 ]
ASF GitHub Bot commented on HIVE-20517:
---------------------------------------

GitHub user maheshk114 opened a pull request:

    https://github.com/apache/hive/pull/430

    HIVE-20517 : Creation of staging directory and Move operation is taking time in S3

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maheshk114/hive HIVE-20517

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #430

----
commit 75c506dcd2824744a6e84c703c47742c43e37fd5
Author: Mahesh Kumar Behera <mbehera@...>
Date:   2018-09-08T03:39:37Z

    HIVE-20517 : Creation of staging directory and Move operation is taking time in S3

----

> Creation of staging directory and Move operation is taking time in S3
> ---------------------------------------------------------------------
>
>                 Key: HIVE-20517
>                 URL: https://issues.apache.org/jira/browse/HIVE-20517
>             Project: Hive
>          Issue Type: Bug
>          Components: repl
>   Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20517.01.patch
>
>
> Operations like insert and add partition create a staging directory to
> generate the files and then move the created files to the actual location.
> In the replication flow, the files are first copied to the staging directory
> and then moved (renamed) to the actual table location. On S3, move is not an
> atomic operation: internally it does a copy and a delete, so it cannot
> guarantee the required consistency. It is therefore better to copy the files
> directly to the actual location. This avoids the staging directory creation
> (which takes 1-2 seconds on S3) and the move (which takes time proportional
> to file size).
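
The cost difference described in the issue can be illustrated with a minimal sketch. This is not code from the patch and does not use Hive or S3 APIs: it assumes a local temporary directory as a stand-in for the bucket, emulates an object-store rename as copy-then-delete, and uses hypothetical names throughout (`s3_style_move`, `copy_via_staging`, `copy_direct`, and the `.hive-staging_example` directory).

```python
import shutil
import tempfile
from pathlib import Path


def s3_style_move(src: Path, dst: Path) -> int:
    """Emulate a rename on an object store: copy the bytes, then delete the source."""
    shutil.copyfile(src, dst)
    size = dst.stat().st_size
    src.unlink()
    return size


def copy_via_staging(files, table_dir: Path) -> int:
    """Copy files into a staging directory, then move them to the table location.

    Returns the total number of bytes written.
    """
    staging = table_dir / ".hive-staging_example"  # hypothetical directory name
    staging.mkdir(parents=True)  # the extra directory creation the issue mentions
    written = 0
    for f in files:
        staged = staging / f.name
        shutil.copyfile(f, staged)
        written += staged.stat().st_size
    # The "move" rewrites every byte a second time.
    for staged in sorted(staging.iterdir()):
        written += s3_style_move(staged, table_dir / staged.name)
    staging.rmdir()
    return written


def copy_direct(files, table_dir: Path) -> int:
    """Copy files straight to the table location. Returns bytes written."""
    table_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for f in files:
        dst = table_dir / f.name
        shutil.copyfile(f, dst)
        written += dst.stat().st_size
    return written


def demo():
    """Run both strategies on three 1000-byte files and return bytes written by each."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "source"
        src.mkdir()
        files = []
        for i in range(3):
            p = src / f"part-{i}"
            p.write_bytes(b"x" * 1000)
            files.append(p)
        staged = copy_via_staging(files, root / "table_staged")
        direct = copy_direct(files, root / "table_direct")
        return staged, direct
```

Under these assumptions the staged path writes every byte twice (once into staging, once during the emulated move), while the direct path writes it once and creates no extra directory. The sketch models only the I/O volume, not S3 latency or consistency behaviour.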