[ https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15389975#comment-15389975 ]
Steve Loughran commented on HIVE-14270:
---------------------------------------

Regarding S3 testing, have a look at what hadoop-aws does: we scan for a specific configuration file declaring the S3 connection details, and only then connect to S3. SPARK-7481 adds a new spark-cloud module for cloud-specific tests, again skipping the tests if there is no binding. Hadoop also has an enforced "declare which endpoint you tested against" policy, which keeps submitters honest and means reviewers get to review functionality rather than deal with code that doesn't work. Recommended.

https://wiki.apache.org/hadoop/HowToContribute#Submitting_patches_against_object_stores_such_as_Amazon_S3.2C_OpenStack_Swift_and_Microsoft_Azure

> Write temporary data to HDFS when doing inserts on tables located on S3
> -----------------------------------------------------------------------
>
>                 Key: HIVE-14270
>                 URL: https://issues.apache.org/jira/browse/HIVE-14270
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>       Attachments: HIVE-14270.1.patch
>
>
> Currently, when doing INSERT statements on tables located at S3, Hive writes
> and reads temporary (or intermediate) files to S3 as well.
> If HDFS is still the default filesystem on Hive, then we can keep such
> temporary files on HDFS to keep things running faster.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
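As an illustration of the skip-if-unconfigured pattern described in the comment above, here is a minimal JUnit 4 sketch. It is not the actual hadoop-aws or Hive test code; the file name auth-keys.properties, the property names, and the class name are hypothetical placeholders for whatever convention the project adopts.

{code:java}
// Illustrative only: a JUnit 4 test that binds to S3 only when credentials
// have been declared in an external file, and is skipped otherwise.
import static org.junit.Assume.assumeTrue;

import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

import org.junit.Before;
import org.junit.Test;

public class S3InsertIT {

  private Properties s3Conf;

  @Before
  public void loadOptionalS3Binding() throws IOException {
    // Hypothetical location of the opt-in S3 test configuration.
    Path authKeys = Paths.get("src/test/resources/auth-keys.properties");
    // No credentials file declared: skip the test rather than fail it.
    assumeTrue("No S3 test binding found at " + authKeys, Files.exists(authKeys));
    s3Conf = new Properties();
    try (FileInputStream in = new FileInputStream(authKeys.toFile())) {
      s3Conf.load(in);
    }
    assumeTrue("fs.s3a.access.key not set",
        s3Conf.getProperty("fs.s3a.access.key") != null);
  }

  @Test
  public void testScratchDirStaysOnHdfs() {
    // Only reached when an S3 endpoint and credentials were declared above;
    // a real test would connect to the configured bucket here.
    String bucket = s3Conf.getProperty("test.s3.bucket", "example-bucket");
    System.out.println("Would run the INSERT-to-S3 test against bucket " + bucket);
  }
}
{code}

With this arrangement, contributors without S3 credentials see the tests reported as skipped rather than failed, while anyone who does declare a binding actually exercises the object-store code path before submitting a patch.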