[ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15395413#comment-15395413
 ] 

Steve Loughran commented on HIVE-14270:
---------------------------------------

Hadoop's FS API is a plugin point: there are more object stores than Hadoop 
itself ships with. I'm thinking of Google's GFS in particular.

I've started on HADOOP-9565 again; this would let you check by saying {{if 
(filesystem instanceof ObjectStore) { ... }}}, though you'd be writing code 
which only works on up-to-date Hadoop versions.
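To make the shape of that check concrete, here is a minimal sketch. Note that {{ObjectStore}} here is a hypothetical marker interface standing in for whatever HADOOP-9565 ends up introducing, and the {{FileSystem}} classes below are simplified stand-ins, not the real Hadoop classes:

```java
// Sketch of the kind of check HADOOP-9565 would enable.
// ObjectStore, FileSystem, and the concrete subclasses are
// illustrative stand-ins, not actual Hadoop types.
public class ObjectStoreCheck {

    /** Hypothetical marker interface an object-store FS would implement. */
    interface ObjectStore {}

    static class FileSystem {}
    static class HdfsFileSystem extends FileSystem {}
    static class S3AFileSystem extends FileSystem implements ObjectStore {}

    /** True if the filesystem advertises object-store semantics. */
    static boolean isObjectStore(FileSystem fs) {
        return fs instanceof ObjectStore;
    }

    public static void main(String[] args) {
        System.out.println(isObjectStore(new S3AFileSystem()));  // object store
        System.out.println(isObjectStore(new HdfsFileSystem())); // real FS
    }
}
```

The appeal of an instanceof check over a config list is that each filesystem implementation declares its own semantics, so new connectors work without anyone updating a scheme list.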

Anyway, I agree with your point that such a config should really go into Hadoop, 
because it would be relevant to more applications. We'd just need to give it a 
default ("s3, s3n, s3a, swift, gfs") which things built against older Hadoop 
versions can use without getting confused. Not sure about S3 though, because 
AWS s3:// != ASF s3:// 
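The scheme-list fallback could look something like the sketch below: match a table location's URI scheme against a comma-separated, configurable list with the default above baked in. The property handling and method names here are purely illustrative, not an actual Hadoop or Hive key:

```java
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

// Sketch: decide whether a path lives on an object store by matching
// its URI scheme against a configurable comma-separated list.
// The default list mirrors the one suggested in the comment above;
// nothing here is a real Hadoop/Hive configuration key.
public class BlobStoreSchemes {

    static final String DEFAULT_SCHEMES = "s3,s3n,s3a,swift,gfs";

    /** Parse the configured list, falling back to the built-in default. */
    static Set<String> parseSchemes(String configured) {
        String value = (configured == null || configured.isEmpty())
                ? DEFAULT_SCHEMES : configured;
        Set<String> schemes = new HashSet<>();
        for (String s : value.split(",")) {
            schemes.add(s.trim().toLowerCase());
        }
        return schemes;
    }

    /** True if the location's scheme is in the (configured or default) list. */
    static boolean isBlobStorePath(String location, String configured) {
        String scheme = URI.create(location).getScheme();
        return scheme != null
                && parseSchemes(configured).contains(scheme.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(isBlobStorePath("s3a://bucket/warehouse/t", null));
        System.out.println(isBlobStorePath("hdfs://nn:8020/warehouse/t", null));
    }
}
```

The downside, as noted, is exactly the s3:// ambiguity: a scheme string can't distinguish the AWS EMR s3:// filesystem from the ASF s3:// one, whereas an instanceof check can.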

> Write temporary data to HDFS when doing inserts on tables located on S3
> -----------------------------------------------------------------------
>
>                 Key: HIVE-14270
>                 URL: https://issues.apache.org/jira/browse/HIVE-14270
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>         Attachments: HIVE-14270.1.patch
>
>
> Currently, when doing INSERT statements on tables located on S3, Hive writes 
> and reads temporary (or intermediate) files to S3 as well. 
> If HDFS is still the default filesystem on Hive, then we can keep such 
> temporary files on HDFS to keep things running faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)