[ https://issues.apache.org/jira/browse/HIVE-6589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16575784#comment-16575784 ]
Ilkka Peltola commented on HIVE-6589:
-------------------------------------

The U-SQL language has [an example implementation|https://msdn.microsoft.com/en-us/azure/data-lake-analytics/u-sql/input-files-u-sql#inputfilesetpath] of this. It might serve as a reference implementation, if that makes it easier to get started. I have used U-SQL (or rather SCOPE, its predecessor) and really miss that functionality in Hive. It makes loading data into external tables and partitioning them a great deal easier, since data so often arrives in the layout folder/year/month/day/file.ext.

> Automatically add partitions for external tables
> ------------------------------------------------
>
>                 Key: HIVE-6589
>                 URL: https://issues.apache.org/jira/browse/HIVE-6589
>             Project: Hive
>          Issue Type: New Feature
>    Affects Versions: 0.14.0
>            Reporter: Ken Dallmeyer
>            Assignee: Dharmendra Pratap Singh
>            Priority: Major
>
> I have a data stream being loaded into Hadoop via Flume. It loads into a date
> partition folder in HDFS. The path looks like this:
> {code}
> /flume/my_data/YYYY/MM/DD/HH
> /flume/my_data/2014/03/02/01
> /flume/my_data/2014/03/02/02
> /flume/my_data/2014/03/02/03
> {code}
> On top of it I create an EXTERNAL Hive table for querying. As of now, I have
> to add partitions manually. What I want is for Hive to "discover" those
> partitions for EXTERNAL tables. Additionally, I would like to specify a
> partition pattern, so that when I query, Hive will know to use the pattern to
> find the HDFS folder.
> So something like this:
> {code}
> CREATE EXTERNAL TABLE my_data (
>   col1 STRING,
>   col2 INT
> )
> PARTITIONED BY (
>   dt STRING,
>   hour STRING
> )
> LOCATION '/flume/my_data'
> TBLPROPERTIES (
>   'hive.partition.spec' = 'dt=$Y-$M-$D, hour=$H',
>   'hive.partition.spec.location' = '$Y/$M/$D/$H'
> );
> {code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
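
For context, the manual registration the report describes corresponds to statements like the following sketch (table name, partition values, and paths are taken from the issue's example; this is the existing workaround, not the proposed feature):

{code}
-- Today, each new HDFS folder must be registered as a partition explicitly:
ALTER TABLE my_data ADD IF NOT EXISTS
  PARTITION (dt='2014-03-02', hour='01')
  LOCATION '/flume/my_data/2014/03/02/01';

-- MSCK REPAIR TABLE can discover partitions automatically, but only when the
-- directories follow the key=value convention (e.g. /dt=2014-03-02/hour=01),
-- which is exactly what the Flume layout above does not use:
MSCK REPAIR TABLE my_data;
{code}

A partition-spec pattern in TBLPROPERTIES, as proposed above, would remove the need for either statement for layouts like this one.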