Sergey created HIVE-5820:
----------------------------

             Summary: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
                 Key: HIVE-5820
                 URL: https://issues.apache.org/jira/browse/HIVE-5820
             Project: Hive
          Issue Type: Bug
    Affects Versions: 0.10.0
         Environment: CDH 4.3  Hive 0.10.0+121
            Reporter: Sergey


Hi, we've created a table:
{code}
create table tmp
comment 'tmp'
partitioned by (year string, month string, day string, fulldate string)
row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
stored as
    inputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
location '/user/lol/tmp'
tblproperties ('avro.schema.literal' =
    '{"name": "tmp", "doc": "version 0.0.1", "type": "record", "fields": [
        {"name": "a", "type": "int"},
        {"name": "b", "type": "int"}
    ]}'
)
{code}

And we try to query it:
{code}
select * from tmp
{code}

and we get an exception:
{code}
13/11/14 17:12:15 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
        at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:66)
        at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:87)
        at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:59)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:249)
        at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:251)
        at org.apache.hadoop.hive.ql.metadata.Partition.initialize(Partition.java:217)
        at org.apache.hadoop.hive.ql.metadata.Partition.<init>(Partition.java:107)
        at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:1573)
        at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:190)
        at org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(ParseContext.java:561)
        at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.checkTree(SimpleFetchOptimizer.java:144)
        at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.optimize(SimpleFetchOptimizer.java:100)
        at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.transform(SimpleFetchOptimizer.java:74)
        at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:102)
        at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8200)
        at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:457)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:349)
        at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.checkedCompile(BeeswaxServiceImpl.java:247)
        at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.compile(BeeswaxServiceImpl.java:200)
        at com.cloudera.beeswax.BeeswaxServiceImpl$2.run(BeeswaxServiceImpl.java:830)
        at com.cloudera.beeswax.BeeswaxServiceImpl$2.run(BeeswaxServiceImpl.java:823)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
        at com.cloudera.beeswax.BeeswaxServiceImpl.doWithState(BeeswaxServiceImpl.java:772)
        at com.cloudera.beeswax.BeeswaxServiceImpl.query(BeeswaxServiceImpl.java:822)
        at com.cloudera.beeswax.api.BeeswaxService$Processor$query.getResult(BeeswaxService.java:915)
        at com.cloudera.beeswax.api.BeeswaxService$Processor$query.getResult(BeeswaxService.java:899)
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
        at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
13/11/14 17:12:15 INFO parse.SemanticAnalyzer: Completed plan generation
13/11/14 17:12:15 INFO ql.Driver: Semantic Analysis Completed
{code}

Here is the describe output:
{code}
0       a       int     from deserializer
1       b       int     from deserializer
2       year    string  
3       month   string  
4       day     string  
5       fulldate        string  
6                       
7       Detailed Table Information      Table(tableName:tmp, dbName:default, 
owner:devops, createTime:1384435112, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[], location:hdfs://nameservice1/user/fedyakov/tmp, 
inputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.avro.AvroSerDe, 
parameters:{serialization.format=1}), bucketCols:[], sortCols:[], 
parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
skewedColValueLocationMaps:{}), storedAsSubDirectories:false), 
partitionKeys:[FieldSchema(name:year, type:string, comment:null), 
FieldSchema(name:month, type:string, comment:null), FieldSchema(name:day, 
type:string, comment:null), FieldSchema(name:fulldate, type:string, 
comment:null)], parameters:{numPartitions=1, numFiles=1, 
avro.schema.literal={"name": "tmp", "doc": "version 0.0.1", "type": "record", 
"fields": [ 
8        {"name": "a", "type": "int"},          
9        {"name": "b", "type": "int"}           
10       ]}, transient_lastDdlTime=1384435137, numRows=0, totalSize=189, 
rawDataSize=0}, viewOriginalText:null, viewExpandedText:null, 
tableType:MANAGED_TABLE) 
{code}

If we specify a schema file (avro.schema.url) instead of the "embedded" schema (avro.schema.literal), it works.
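
For reference, here is a rough sketch of what the file-based variant could look like. The exact statement that was used is not shown in this report, so treat it as an illustration only: the schema path hdfs:///user/lol/schemas/tmp.avsc is a hypothetical placeholder, and the file is assumed to contain the same JSON record definition as the literal above.
{code}
-- Sketch only: same table as above, but the schema is read from a file.
-- The path in 'avro.schema.url' is a made-up placeholder, not taken from the report.
create table tmp
comment 'tmp'
partitioned by (year string, month string, day string, fulldate string)
row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
stored as
    inputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
location '/user/lol/tmp'
tblproperties ('avro.schema.url' = 'hdfs:///user/lol/schemas/tmp.avsc')
{code}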




--
This message was sent by Atlassian JIRA
(v6.1#6144)
