[ https://issues.apache.org/jira/browse/HIVE-5820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822462#comment-13822462 ]
hellojinjie commented on HIVE-5820:
-----------------------------------

Same environment (CDH 4), same version of Hive: I also hit this problem.
I use avro.schema.url to specify the URL of the .avsc file:
{code}
TBLPROPERTIES (
  'avro.schema.url'='hdfs:///user/qos/workspaces/avro/Message.avsc')
{code}
During the query, the log also prints the same error message, but it does not affect the execution of the Hive query.

> Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
> ---------------------------------------------------------------------------------------
>
> Key: HIVE-5820
> URL: https://issues.apache.org/jira/browse/HIVE-5820
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.10.0
> Environment: CDH 4.3 Hive 0.10.0+121
> Reporter: Sergey
>
> Hi, we've created a table:
> {code}
> create table tmp
> comment 'tmp'
> partitioned by (year string, month string, day string, fulldate string)
> row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> stored as
> inputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
> outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> location '/user/lol/tmp'
> tblproperties ('avro.schema.literal' =
> '{"name": "tmp", "doc": "version 0.0.1", "type": "record", "fields": [
> {"name": "a", "type": "int"},
> {"name": "b", "type": "int"}
> ]}'
> )
> {code}
> And we try to query it:
> {code}
> select * from tmp
> {code}
> and we get an exception
> {code}
> 13/11/14 17:12:15 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException determining schema.
> Returning signal schema to indicate problem
> org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
>     at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:66)
>     at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:87)
>     at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:59)
>     at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:249)
>     at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:251)
>     at org.apache.hadoop.hive.ql.metadata.Partition.initialize(Partition.java:217)
>     at org.apache.hadoop.hive.ql.metadata.Partition.<init>(Partition.java:107)
>     at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:1573)
>     at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:190)
>     at org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(ParseContext.java:561)
>     at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.checkTree(SimpleFetchOptimizer.java:144)
>     at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.optimize(SimpleFetchOptimizer.java:100)
>     at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.transform(SimpleFetchOptimizer.java:74)
>     at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:102)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8200)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:457)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:349)
>     at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.checkedCompile(BeeswaxServiceImpl.java:247)
>     at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.compile(BeeswaxServiceImpl.java:200)
>     at com.cloudera.beeswax.BeeswaxServiceImpl$2.run(BeeswaxServiceImpl.java:830)
>     at com.cloudera.beeswax.BeeswaxServiceImpl$2.run(BeeswaxServiceImpl.java:823)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>     at com.cloudera.beeswax.BeeswaxServiceImpl.doWithState(BeeswaxServiceImpl.java:772)
>     at com.cloudera.beeswax.BeeswaxServiceImpl.query(BeeswaxServiceImpl.java:822)
>     at com.cloudera.beeswax.api.BeeswaxService$Processor$query.getResult(BeeswaxService.java:915)
>     at com.cloudera.beeswax.api.BeeswaxService$Processor$query.getResult(BeeswaxService.java:899)
>     at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>     at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>     at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>     at java.lang.Thread.run(Thread.java:662)
> 13/11/14 17:12:15 INFO parse.SemanticAnalyzer: Completed plan generation
> 13/11/14 17:12:15 INFO ql.Driver: Semantic Analysis Completed
> {code}
> here is describe:
> {code}
> 0 a int from deserializer
> 1 b int from deserializer
> 2 year string
> 3 month string
> 4 day string
> 5 fulldate string
> 6
> 7 Detailed Table Information Table(tableName:tmp, dbName:default,
> owner:devops, createTime:1384435112, lastAccessTime:0, retention:0,
> sd:StorageDescriptor(cols:[], location:hdfs://nameservice1/user/fedyakov/tmp,
> inputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat,
> outputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat,
> compressed:false,
> numBuckets:-1, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.hadoop.hive.serde2.avro.AvroSerDe,
> parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
> parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
> skewedColValueLocationMaps:{}), storedAsSubDirectories:false),
> partitionKeys:[FieldSchema(name:year, type:string, comment:null),
> FieldSchema(name:month, type:string, comment:null), FieldSchema(name:day,
> type:string, comment:null), FieldSchema(name:fulldate, type:string,
> comment:null)], parameters:{numPartitions=1, numFiles=1,
> avro.schema.literal={"name": "tmp", "doc": "version 0.0.1", "type": "record",
> "fields": [
> 8 {"name": "a", "type": "int"},
> 9 {"name": "b", "type": "int"}
> 10 ]}, transient_lastDdlTime=1384435137, numRows=0, totalSize=189,
> rawDataSize=0}, viewOriginalText:null, viewExpandedText:null,
> tableType:MANAGED_TABLE)
> {code}
> If we specify a file instead of an "embedded" Avro schema, it works.
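
For readers following the trace, here is a rough model of the check that produces the logged message. It is an illustrative Python sketch, not Hive's code: `determine_schema_source`, `AvroSerdeError` and the two sample dictionaries are hypothetical names. It assumes the lookup tries avro.schema.literal first and avro.schema.url second, and that the property set reaching the SerDe through Partition.getDeserializer may lack the table-level keys. Nothing in this thread confirms either assumption.

```python
from typing import Mapping, Tuple

SCHEMA_LITERAL = "avro.schema.literal"
SCHEMA_URL = "avro.schema.url"


class AvroSerdeError(Exception):
    """Hypothetical stand-in for the AvroSerdeException seen in the log."""


def determine_schema_source(properties: Mapping[str, str]) -> Tuple[str, str]:
    """Return ("literal", schema_text) or ("url", location).

    Raises AvroSerdeError when neither property is present, mirroring
    the message in the stack trace. The lookup order is an assumption.
    """
    literal = properties.get(SCHEMA_LITERAL)
    if literal:
        return ("literal", literal)
    url = properties.get(SCHEMA_URL)
    if url:
        return ("url", url)
    raise AvroSerdeError(
        "Neither avro.schema.literal nor avro.schema.url specified, "
        "can't determine table schema"
    )


# Table-level parameters, abbreviated from the describe output above.
table_params = {
    SCHEMA_LITERAL: '{"name": "tmp", "type": "record", "fields": []}',
}

# SerDe parameters as shown in the storage descriptor: no schema keys.
serde_params = {"serialization.format": "1"}
```

Under this model, `determine_schema_source(table_params)` succeeds, while the same call on `serde_params` raises the error from the log. That would be consistent with the warning appearing even though the table itself carries a schema, but it is only one possible reading of the trace.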