I have some Avro data files partitioned by date, for example: /items/yyyy/mm/dd/part-r-00000.avro
I want to create a Hive table on this data, so I have a create table statement:

    CREATE EXTERNAL TABLE items
    PARTITIONED BY (year STRING, month STRING, day STRING)
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    STORED AS
      INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
      OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    LOCATION '/items'
    TBLPROPERTIES (
      'avro.schema.url' = 'file://${schemabase}/Item.avsc'
    );

The item schema is:

    {
      "type" : "record",
      "name" : "Item",
      "fields" : [
        { "name" : "year", "type" : "string" },
        { "name" : "month", "type" : "string" },
        { "name" : "day", "type" : "string" },
        { "name" : "hour", "type" : "string" },
        { "name" : "dt", "type" : "long" },
        { "name" : "user_agent", "type" : "string" },
        { "name" : "ip", "type" : "string" }
      ]
    }

The problem is that I get an error creating the table:

    FAILED: Error in metadata: org.apache.hadoop.hive.ql.metadata.HiveException: Partition column name year conflicts with table columns.
    14/01/27 03:08:12 ERROR exec.Task: FAILED: Error in metadata: org.apache.hadoop.hive.ql.metadata.HiveException: Partition column name year conflicts with table columns.
    org.apache.hadoop.hive.ql.metadata.HiveException: org.apache.hadoop.hive.ql.metadata.HiveException: Partition column name year conflicts with table columns.
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:582)
        at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:3719)
        at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:254)
        at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:138)
        at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:66)
        at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1383)
        at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1169)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:982)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:902)
        at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:259)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:216)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:412)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:347)
        at org.apache.hadoop.hive.cli.CliDriver.processReader(CliDriver.java:445)
        at org.apache.hadoop.hive.cli.CliDriver.processFile(CliDriver.java:455)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:713)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:613)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:208)
    Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Partition column name year conflicts with table columns.
        at org.apache.hadoop.hive.ql.metadata.Table.checkValidity(Table.java:213)
        at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:560)
        ... 21 more

So, how do I tell Hive to use a column in the Avro schema for partitioning?

Thanks in advance,
George
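For context on the error: Hive partition columns are not read from the data files. Their values come from the partition metadata, and their names may not duplicate any column the SerDe derives from the Avro schema, which is why `year` is rejected. One possible direction, given only as an untested sketch, is to give the partition columns distinct names and register each date directory explicitly, since the `/items/yyyy/mm/dd` layout is not in Hive's `key=value` directory form. The names `p_year`, `p_month`, and `p_day` and the date `2014/01/27` are assumptions made for illustration only.

```sql
-- Sketch only: p_year, p_month, p_day are hypothetical partition column names,
-- chosen so they do not collide with the year/month/day fields in Item.avsc.
CREATE EXTERNAL TABLE items
PARTITIONED BY (p_year STRING, p_month STRING, p_day STRING)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/items'
TBLPROPERTIES (
  'avro.schema.url' = 'file://${schemabase}/Item.avsc'
);

-- The directories are not named key=value, so each partition is mapped
-- to its location by hand. The date below is an illustrative example.
ALTER TABLE items ADD PARTITION (p_year = '2014', p_month = '01', p_day = '27')
LOCATION '/items/2014/01/27';

-- Filtering on the partition columns lets Hive prune directories; the
-- year/month/day fields stored inside the Avro records remain ordinary columns.
SELECT ip, user_agent
FROM items
WHERE p_year = '2014' AND p_month = '01' AND p_day = '27';
```

With this arrangement the date appears twice in each row, once as the Avro fields and once as the partition columns. One `ALTER TABLE ... ADD PARTITION` statement would be needed per date directory.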