Hi Ameet,
That's the correct behaviour.

In Hive, clustering and sorting happen within a partition. Inside each 
partition, the partition column has only one value, so it has no effect on 
clustering or sorting. Putting the partition column in the clustered by/sorted 
by clause of the create table statement therefore doesn't make sense.

Also, your create table statement should be something like the following. Note 
the removal of col3 from the column list: a partition column is declared only 
in the partitioned by clause.
create table abc ( col1 string, col2 string) 
partitioned by (col3 string) 
clustered by (col1) sorted by (col1) into 10 buckets; 
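
If it helps, here is a rough sketch of how you might then load one partition of 
that table so the rows actually land in sorted buckets. The staging table name 
(abc_staging) and the partition value 'a' are made up for illustration, and I'm 
assuming your Hive version uses the hive.enforce.bucketing and 
hive.enforce.sorting settings:
-- ask Hive to honour the clustered by/sorted by definition on insert
set hive.enforce.bucketing = true;
set hive.enforce.sorting = true;
-- abc_staging is a hypothetical unpartitioned source table with col1, col2, col3
insert overwrite table abc partition (col3 = 'a')
select col1, col2 from abc_staging where col3 = 'a';

Every row written by that insert has col3 = 'a', so only col1 can affect which 
bucket a row goes to and how it is ordered within the bucket.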

Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation 

www: oanda.com www: fxtrade.com 

----- Original Message -----
From: "ameet chaubal" <ameetchau...@yahoo.com>
To: user@hive.apache.org
Sent: Thursday, May 10, 2012 2:24:02 PM
Subject: partition column not allowed in clustered by clause ?



Hi All, 


I am not able to create a table with partition column also included in the 
clustered by clause. 
create table abc ( col1, col2, col3 ) 
partitioned by (col3) 
clustered by (col1,col3) sorted by (col1,col3) into 10 buckets; 


fails with : FAILED: Error in semantic analysis: Invalid column reference 


Any reason why this is the case? 

Sincerely, 


Ameet 
