[ 
https://issues.apache.org/jira/browse/HIVE-14270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15418580#comment-15418580
 ] 

Lefty Leverenz commented on HIVE-14270:
---------------------------------------

bq.  ... wiki section about S3 or blobstore tables?

The wiki doesn't mention blobstore tables.
S3 is mentioned in a few wikidocs:

* [Hive and Amazon Web Services | 
https://cwiki.apache.org/confluence/display/Hive/HiveAws]
* [Hive on Amazon Elastic MapReduce -- Hive S3 Tables | 
https://cwiki.apache.org/confluence/display/Hive/HiveAmazonElasticMapReduce#Hive S3 Tables]
* [HiveAws HivingS3nRemotely | 
https://cwiki.apache.org/confluence/display/Hive/HiveAws+HivingS3nRemotely] 
(orphan doc:  no links from other wikidocs)
* [AdminManual Configuration -- Temporary Folders | 
https://cwiki.apache.org/confluence/display/Hive/AdminManual+Configuration#AdminManualConfiguration-TemporaryFolders]

(Note that the AWS and EMR docs haven't been updated since 2011.)

You could also document this with the INSERT information:

* [LanguageManual DML -- Inserting data into Hive Tables from queries | 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML#LanguageManualDML-InsertingdataintoHiveTablesfromqueries]

The new configuration parameters belong here:

* [Hive Configuration Properties | 
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties]

> Write temporary data to HDFS when doing inserts on tables located on S3
> -----------------------------------------------------------------------
>
>                 Key: HIVE-14270
>                 URL: https://issues.apache.org/jira/browse/HIVE-14270
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergio Peña
>            Assignee: Sergio Peña
>              Labels: TODOC2.2
>             Fix For: 2.2.0
>
>         Attachments: HIVE-14270.1.patch, HIVE-14270.2.patch, 
> HIVE-14270.3.patch, HIVE-14270.4.patch, HIVE-14270.5.patch, HIVE-14270.6.patch
>
>
> Currently, when running INSERT statements on tables located on S3, Hive also 
> writes and reads its temporary (or intermediate) files on S3. 
> If HDFS is still Hive's default filesystem, we can keep these 
> temporary files on HDFS to make things run faster.
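
The behavior described in the issue summary above can be sketched roughly as follows. This is only an illustration of the decision, not Hive's actual code: the function names, the scheme list, and the /tmp/hive scratch path are all assumptions made for this example.

```python
from urllib.parse import urlparse

# Hypothetical list of URI schemes treated as blob stores (an assumption
# for this sketch, not taken from the issue).
BLOBSTORE_SCHEMES = {"s3", "s3a", "s3n"}


def is_blobstore(path):
    """Return True if the path's URI scheme is one of the blob store schemes."""
    return urlparse(path).scheme.lower() in BLOBSTORE_SCHEMES


def choose_scratch_dir(table_location, default_fs, scratch_subdir="/tmp/hive"):
    """Pick the directory for temporary files written during an INSERT.

    If the table lives on a blob store but the default filesystem does not,
    keep temporary files on the default filesystem (for example HDFS).
    Otherwise keep them on the same filesystem as the table.
    """
    if is_blobstore(table_location) and not is_blobstore(default_fs):
        return default_fs.rstrip("/") + scratch_subdir
    parsed = urlparse(table_location)
    return f"{parsed.scheme}://{parsed.netloc}{scratch_subdir}"


if __name__ == "__main__":
    # Table on S3, HDFS as default filesystem: temporary data stays on HDFS.
    print(choose_scratch_dir("s3a://bucket/warehouse/t", "hdfs://nn:8020"))
```

With a table on S3 and HDFS as the default filesystem, the sketch picks a scratch directory on HDFS; only the final result would then need to be written to S3.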



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
