Hello all, I tried the following on a build that has the latest HIVE-5783 patch applied over trunk:

hive> set hive.aux.jars.path=file:///usr/lib/hcatalog/share/hcatalog/hcatalog-core.jar,file:///usr/lib/hive/lib/parquet-hadoop-bundle-1.3.2.jar;
hive> create table alltypes_parquet stored as parquet as select cint, ctinyint, csmallint, cdouble, cfloat, cstring1 from alltypesorc;
hive> show create table alltypes_parquet;
OK
CREATE TABLE `alltypes_parquet`(
  `cint` int COMMENT 'from deserializer',
  `ctinyint` tinyint COMMENT 'from deserializer',
  `csmallint` smallint COMMENT 'from deserializer',
  `cdouble` double COMMENT 'from deserializer',
  `cfloat` float COMMENT 'from deserializer',
  `cstring1` string COMMENT 'from deserializer')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
LOCATION
  'hdfs://sandbox.hortonworks.com:8020/apps/hive/warehouse/alltypes_parquet'
TBLPROPERTIES (
  'numFiles'='1',
  'transient_lastDdlTime'='1391609238',
  'COLUMN_STATS_ACCURATE'='true',
  'totalSize'='256959',
  'numRows'='12288',
  'rawDataSize'='73728')
Time taken: 0.256 seconds, Fetched: 22 row(s)
hive> select * from alltypes_parquet where 1=1;
...
Error:
Caused by: parquet.io.InvalidRecordException: cint not found in message table_schema { }
    at parquet.schema.GroupType.getFieldIndex(GroupType.java:104)
    at parquet.schema.GroupType.getType(GroupType.java:136)
    at org.apache.hadoop.hive.ql.io.parquet.read.DataWritableReadSupport.init(DataWritableReadSupport.java:93)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.getSplit(ParquetRecordReaderWrapper.java:205)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:79)
    at org.apache.hadoop.hive.ql.io.parquet.read.ParquetRecordReaderWrapper.<init>(ParquetRecordReaderWrapper.java:66)
    at org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat.getRecordReader(MapredParquetInputFormat.java:51)
    at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:65)

So what am I missing? The catalog info seems at odds with the record structure after CREATE TABLE.

Thanks,
~Remus

PS. alltypesorc is the test ORC table based on data from <enlistment>\data\files\alltypesorc