Thanks, Sumanth. So it seems this cannot be done without an intermediate step that uses an extra table.

On 2011/09/18, at 04:56, Sumanth V wrote:

> Hi Adriaan,
>
> To use dynamic partition, follow the following steps inside hive shell -
>
> #Set the following values -
>
> set hive.exec.dynamic.partition.mode=nonstrict;
>
> set hive.exec.dynamic.partition=true;
>
> #Create another table -
>
> create table raw_2
> (
> data string
> )
> partitioned by (partition1 string, partition2 string);
>
> #Now insert the values stored in table raw into table raw_2 using the
> following query -
>
> from raw
> insert overwrite table raw_2 partition (partition1, partition2)
> select data, partition1, partition2;
>
> This will dynamically create the 2 partitions based on the values of
> partition1 and partition2 and insert the values of 'data' in the appropriate
> partition.
>
> Regards,
> Sumanth
>
> On Sat, Sep 17, 2011 at 2:18 PM, Adriaan Tijsseling
> <[email protected]> wrote:
>
>> Hi,
>>
>> I have a table created with
>>
>> CREATE TABLE raw(partition1 string, partition2 string, data string) ROW
>> FORMAT DELIMITED FIELDS TERMINATED BY '\001' STORED AS TEXTFILE;
>>
>> I want to further process "data" and put it in a partition (partition1,
>> partition2) defined by the values in the relevant row.
>>
>> I'm however stuck at trying to use dynamic partitions in a query. With
>> predefined partition values it's straightforward:
>>
>> FROM (
>> FROM raw
>> SELECT TRANSFORM(raw.data)
>> USING 'python parser.py' AS (foo STRING, date STRING, bar
>> MAP<STRING,STRING>)
>> CLUSTER BY date
>> ) tmap
>> INSERT OVERWRITE TABLE polished PARTITION (partition1='p1',
>> partition2='p2') SELECT foo, date, bar;
>>
>> What would be the best way to define the partition using raw.partition1 and
>> raw.partition2 as values?
>>
>> Thanks much,
>>
>> Adriaan
