Yibing Shi created HIVE-14205:
---------------------------------

             Summary: Hive doesn't support union type with AVRO file format
                 Key: HIVE-14205
                 URL: https://issues.apache.org/jira/browse/HIVE-14205
             Project: Hive
          Issue Type: Bug
          Components: Serializers/Deserializers
            Reporter: Yibing Shi


Reproduce steps:
{noformat}
hive> CREATE TABLE avro_union_test
    > PARTITIONED BY (p int)
    > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    > STORED AS INPUTFORMAT 
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    > OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    > TBLPROPERTIES ('avro.schema.literal'='{
    >    "type":"record",
    >    "name":"nullUnionTest",
    >    "fields":[
    >       {
    >          "name":"value",
    >          "type":[
    >             "null",
    >             "int",
    >             "long"
    >          ],
    >          "default":null
    >       }
    >    ]
    > }');
OK
Time taken: 0.105 seconds
hive> alter table avro_union_test add partition (p=1);
OK
Time taken: 0.093 seconds
hive> select * from avro_union_test;
FAILED: RuntimeException org.apache.hadoop.hive.ql.metadata.HiveException: 
Failed with exception Hive internal error inside 
isAssignableFromSettablePrimitiveOI void not supported 
yet.java.lang.RuntimeException: Hive internal error inside 
isAssignableFromSettablePrimitiveOI void not supported yet.
        at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettablePrimitiveOI(ObjectInspectorUtils.java:1140)
        at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.isInstanceOfSettableOI(ObjectInspectorUtils.java:1149)
        at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1187)
        at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1220)
        at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.hasAllFieldsSettable(ObjectInspectorUtils.java:1200)
        at 
org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConvertedOI(ObjectInspectorConverters.java:219)
        at 
org.apache.hadoop.hive.ql.exec.FetchOperator.setupOutputObjectInspector(FetchOperator.java:581)
        at 
org.apache.hadoop.hive.ql.exec.FetchOperator.initialize(FetchOperator.java:172)
        at 
org.apache.hadoop.hive.ql.exec.FetchOperator.<init>(FetchOperator.java:140)
        at 
org.apache.hadoop.hive.ql.exec.FetchTask.initialize(FetchTask.java:79)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:482)
        at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:311)
        at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1194)
        at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1289)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1120)
        at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1108)
        at 
org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:218)
        at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:170)
        at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:381)
        at 
org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:773)
        at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:691)
        at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:626)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:497)
        at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
{noformat}

Another test case to show this problem is:
{noformat}
hive> create table avro_union_test2 (value uniontype<int,bigint>) stored as 
avro;
OK
Time taken: 0.053 seconds
hive> show create table avro_union_test2;
OK
CREATE TABLE `avro_union_test2`(
  `value` uniontype<void,int,bigint> COMMENT '')
ROW FORMAT SERDE
  'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION
  'hdfs://localhost/user/hive/warehouse/avro_union_test2'
TBLPROPERTIES (
  'transient_lastDdlTime'='1468173589')
Time taken: 0.051 seconds, Fetched: 12 row(s)
{noformat}

Although column {{value}} is defined as {{uniontype<int,bigint>}} in create 
table command, its type becomes {{uniontype<void,int,bigint>}} after table is 
defined. Hive accidentally make the nullable definition in avro schema 
({{\["null", "int", "long"\]}})  into union definition.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to