[ 
https://issues.apache.org/jira/browse/HIVE-6785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13958074#comment-13958074
 ] 

Szehon Ho commented on HIVE-6785:
---------------------------------

OK, I'm not a huge fan of moving that inspector to an unnatural place, because 
it will be stuck there going forward in Hive, but we can let others chime in 
as well.

If it's really important to support earlier Hive versions, maybe one option is 
to back-port a different version of the patch into Parquet?

> query fails when partitioned table's table level serde is ParquetHiveSerDe 
> and partition level serde is of different SerDe
> --------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6785
>                 URL: https://issues.apache.org/jira/browse/HIVE-6785
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats, Serializers/Deserializers
>    Affects Versions: 0.13.0
>            Reporter: Tongjie Chen
>         Attachments: HIVE-6785.1.patch.txt
>
>
> When a Hive table's SerDe is ParquetHiveSerDe while some partitions use a 
> different SerDe, and the table has string column[s], Hive generates a 
> confusing error message:
> "Failed with exception java.io.IOException:java.lang.ClassCastException: 
> parquet.hive.serde.primitive.ParquetStringInspector cannot be cast to 
> org.apache.hadoop.hive.serde2.objectinspector.primitive.SettableTimestampObjectInspector"
> This is confusing because timestamp is mentioned even though the table does 
> not use it. The reason is that when the table and partition SerDes differ, 
> Hive tries to convert between the object inspectors of the two SerDes. 
> ParquetHiveSerDe's object inspector for the string type is 
> ParquetStringInspector (newly introduced), which is neither a subclass of 
> WritableStringObjectInspector nor of JavaStringObjectInspector, the types 
> that ObjectInspectorConverters expects for a string-category object 
> inspector. There is no break statement in the STRING case, so execution 
> falls through to the following TIMESTAMP case, which generates the confusing 
> error message.
> See also the following Parquet issue:
> https://github.com/Parquet/parquet-mr/issues/324
> The fix is relatively easy: make ParquetStringInspector a subclass of 
> JavaStringObjectInspector instead of AbstractPrimitiveJavaObjectInspector. 
> But because the constructor of JavaStringObjectInspector is package scope 
> rather than public or protected, we would need to move 
> ParquetStringInspector into the same package as JavaStringObjectInspector.
> ArrayWritableObjectInspector's setStructFieldData also needs to accept 
> List data, since the corresponding setStructFieldData and create methods 
> return a list. This is also needed when the table SerDe is ParquetHiveSerDe 
> and the partition SerDe is something else.
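
The fall-through described in the quoted report can be reproduced with a 
minimal, self-contained sketch. This is not Hive's actual 
ObjectInspectorConverters code: the class FallThroughDemo, the Category enum, 
and the three inspector stand-ins are hypothetical names invented for 
illustration. It only shows how a STRING case that matches no known inspector 
and has no break or throw ends up in the TIMESTAMP cast, and what a stricter 
variant might look like.

```java
public class FallThroughDemo {
    // Hypothetical stand-in for the primitive categories switched on.
    enum Category { STRING, TIMESTAMP }

    // Hypothetical stand-ins for the inspector types named in the report.
    interface Inspector {}
    static class SettableStringInspector implements Inspector {}
    static class SettableTimestampInspector implements Inspector {}
    // Plays the role of an inspector that is not a known string inspector subclass.
    static class ParquetLikeStringInspector implements Inspector {}

    // Buggy shape: an unrecognized STRING inspector falls through to TIMESTAMP.
    static String pickConverterBuggy(Category category, Inspector output) {
        switch (category) {
            case STRING:
                if (output instanceof SettableStringInspector) {
                    return "string converter";
                }
                // No break or throw here, so control continues into the next case.
            case TIMESTAMP:
                // This cast fails for any non-timestamp inspector.
                SettableTimestampInspector ts = (SettableTimestampInspector) output;
                return "timestamp converter";
            default:
                throw new IllegalArgumentException("Unknown category: " + category);
        }
    }

    // Stricter shape: each case ends explicitly, so the error names the real problem.
    static String pickConverterStrict(Category category, Inspector output) {
        switch (category) {
            case STRING:
                if (output instanceof SettableStringInspector) {
                    return "string converter";
                }
                throw new IllegalArgumentException(
                    "Unsupported STRING inspector: " + output.getClass().getSimpleName());
            case TIMESTAMP:
                if (output instanceof SettableTimestampInspector) {
                    return "timestamp converter";
                }
                throw new IllegalArgumentException(
                    "Unsupported TIMESTAMP inspector: " + output.getClass().getSimpleName());
            default:
                throw new IllegalArgumentException("Unknown category: " + category);
        }
    }

    public static void main(String[] args) {
        Inspector parquetLike = new ParquetLikeStringInspector();
        try {
            pickConverterBuggy(Category.STRING, parquetLike);
        } catch (ClassCastException e) {
            // The failure mentions the timestamp type although the category was STRING.
            System.out.println("buggy: " + e.getClass().getSimpleName());
        }
        try {
            pickConverterStrict(Category.STRING, parquetLike);
        } catch (IllegalArgumentException e) {
            System.out.println("strict: " + e.getMessage());
        }
    }
}
```

The strict variant only improves the error message. The query would still 
fail until the Parquet string inspector is one of the types the converter 
recognizes, which is what the attached patch proposes.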



--
This message was sent by Atlassian JIRA
(v6.2#6252)
