[ 
https://issues.apache.org/jira/browse/HIVE-20517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608739#comment-16608739
 ] 

ASF GitHub Bot commented on HIVE-20517:
---------------------------------------

GitHub user maheshk114 opened a pull request:

    https://github.com/apache/hive/pull/430

    HIVE-20517 : Creation of staging directory and Move operation is taking time in S3

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maheshk114/hive HIVE-20517

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/hive/pull/430.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #430
    
----
commit 75c506dcd2824744a6e84c703c47742c43e37fd5
Author: Mahesh Kumar Behera <mbehera@...>
Date:   2018-09-08T03:39:37Z

    HIVE-20517 : Creation of staging directory and Move operation is taking time in S3

----


> Creation of staging directory and Move operation is taking time in S3
> ---------------------------------------------------------------------
>
>                 Key: HIVE-20517
>                 URL: https://issues.apache.org/jira/browse/HIVE-20517
>             Project: Hive
>          Issue Type: Bug
>          Components: repl
>    Affects Versions: 4.0.0
>            Reporter: mahesh kumar behera
>            Assignee: mahesh kumar behera
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20517.01.patch
>
>
> Operations like insert and add partition create a staging directory, 
> generate their files there, and then move those files to the actual location. 
> In the replication flow, the files are first copied to the staging directory 
> and then moved (renamed) to the actual table location. On S3, a move is not 
> an atomic operation: internally it is a copy followed by a delete, so it 
> cannot guarantee the required consistency. It is therefore better to copy 
> the files directly to the actual location. This avoids both the staging 
> directory creation (which takes 1-2 seconds on S3) and the move (which takes 
> time proportional to file size).
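
The two flows described above can be compared with a minimal sketch. This is
not the code from the pull request: it uses java.nio.file on the local
filesystem as a stand-in for S3, and the class name CopyFlowSketch, the method
names, and the ".staging_" prefix are all hypothetical, chosen only for
illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class CopyFlowSketch {

    // Staged flow: copy into a staging directory, then move (rename) the file
    // into the table location. On an object store such as S3 the move is itself
    // a copy followed by a delete, so the data is effectively written twice.
    static Path copyViaStaging(Path source, Path tableDir) throws IOException {
        Files.createDirectories(tableDir);
        // Hypothetical staging directory name; created under the table location.
        Path staging = Files.createTempDirectory(tableDir, ".staging_");
        Path staged = Files.copy(source, staging.resolve(source.getFileName()));
        Path target = Files.move(staged, tableDir.resolve(source.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
        Files.delete(staging); // staging directory is empty after the move
        return target;
    }

    // Direct flow: copy straight into the table location, with no staging
    // directory and no move step.
    static Path copyDirect(Path source, Path tableDir) throws IOException {
        Files.createDirectories(tableDir);
        return Files.copy(source, tableDir.resolve(source.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        Path work = Files.createTempDirectory("hive20517_");
        Path source = Files.writeString(work.resolve("000000_0"), "row1\nrow2\n");

        Path viaStaging = copyViaStaging(source, work.resolve("table_a"));
        Path direct = copyDirect(source, work.resolve("table_b"));

        // Both flows leave the same content at the table location.
        System.out.println(Files.readString(viaStaging).equals(Files.readString(direct)));
    }
}
```

On a local filesystem the rename is cheap, so this sketch only shows the
difference in the number of steps; the cost argument applies to stores where
a rename is implemented as copy plus delete.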



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
