[ https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15394679#comment-15394679 ]

Sergio Peña commented on HIVE-14270:
------------------------------------

[~poeppt] [~ste...@apache.org] I submitted another patch to the RB. I will skip 
attaching it here for now to avoid triggering the unit test run.

One question:
- Wouldn't it be better if Hadoop could return the list of blobstore schemes it 
supports? I think this is better for 2 reasons:
  1. Future versions of Hadoop might add other blobstore schemes, and this way 
we would avoid changing Hive and just add more test coverage.
  2. Other non-Hive components may want to get the list of currently supported 
blobstores from Hadoop.
I can add the configuration variable, but I was wondering about that.
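As a rough illustration of the configuration-variable alternative, here is a minimal sketch of how Hive could decide whether a table location lives on a blobstore. The class name, the property value, and the method are all hypothetical (not taken from the patch): it assumes a comma-separated list of schemes and compares it against the scheme of the table's URI.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class BlobStorageUtils {
    // Hypothetical comma-separated config value, standing in for a
    // Hive configuration variable listing supported blobstore schemes.
    static final String SUPPORTED_SCHEMES = "s3,s3a,s3n";

    static Set<String> supportedSchemes() {
        return new HashSet<>(Arrays.asList(SUPPORTED_SCHEMES.split(",")));
    }

    // Returns true if the location's URI scheme matches a configured
    // blobstore scheme; false for HDFS or scheme-less paths.
    static boolean isBlobStoragePath(String location) {
        String scheme = URI.create(location).getScheme();
        return scheme != null && supportedSchemes().contains(scheme.toLowerCase());
    }

    public static void main(String[] args) {
        System.out.println(isBlobStoragePath("s3a://bucket/warehouse/t1"));
        System.out.println(isBlobStoragePath("hdfs://nn:8020/warehouse/t1"));
    }
}
```

If Hadoop exposed its own list of supported schemes, the hard-coded constant above would simply be replaced by that lookup, which is the trade-off being discussed.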

I'm still working on the testing part to run the S3 tests. I'm still deciding 
whether to use q-tests or write JUnit code; both have different complications.
In the meantime I uploaded the code to RB so you can help review it. Btw, 
thanks for your help reviewing it.


> Write temporary data to HDFS when doing inserts on tables located on S3
> -----------------------------------------------------------------------
>
>                 Key: HIVE-14270
>                 URL: https://issues.apache.org/jira/browse/HIVE-14270
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>         Attachments: HIVE-14270.1.patch
>
>
> Currently, when doing INSERT statements on tables located on S3, Hive writes 
> and reads temporary (or intermediate) files to S3 as well. 
> If HDFS is still the default filesystem on Hive, then we can keep such 
> temporary files on HDFS to keep things running faster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
