[ https://issues.apache.org/jira/browse/HIVE-6784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13961358#comment-13961358 ]
Hive QA commented on HIVE-6784:
-------------------------------

{color:green}Overall{color}: +1 all checks pass

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12638877/HIVE-6784.2.patch.txt

{color:green}SUCCESS:{color} +1 5550 tests passed

Test results: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2147/testReport
Console output: http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/2147/console
Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12638877

> parquet-hive should allow column type change
> --------------------------------------------
>
>                 Key: HIVE-6784
>                 URL: https://issues.apache.org/jira/browse/HIVE-6784
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats, Serializers/Deserializers
>    Affects Versions: 0.13.0
>            Reporter: Tongjie Chen
>             Fix For: 0.14.0
>
>         Attachments: HIVE-6784.1.patch.txt, HIVE-6784.2.patch.txt
>
>
> See also the following parquet-mr issue:
> https://github.com/Parquet/parquet-mr/issues/323
>
> Currently, if we change a Parquet-format Hive table using "alter table
> parquet_table change c1 c1 bigint" (assuming the original type of c1 is int),
> queries fail at runtime with an exception thrown from the SerDe:
> "org.apache.hadoop.io.IntWritable cannot be cast to
> org.apache.hadoop.io.LongWritable".
> This differs from Hive's behavior with other file formats, where it tries to
> perform the cast (producing a null value for incompatible types).
>
> Parquet Hive's RecordReader returns an ArrayWritable (based on the schema
> stored in the footers of the Parquet files); ParquetHiveSerDe also creates a
> corresponding ArrayWritableObjectInspector (but using column type info from
> the metastore). Whenever there is a column type change, the object inspector
> throws an exception, since WritableLongObjectInspector cannot inspect an
> IntWritable, etc.
>
> Conversion has to happen somewhere if we want to allow type change. The
> SerDe's deserialize method seems a natural place for it (a sketch of such a
> conversion follows below this message).
> Currently, the serialize method calls createStruct (then createPrimitive) for
> every record, but it creates a new object regardless, which seems expensive.
> I think that could be optimized a bit by just returning the object passed in
> if it is already of the right type. deserialize would also reuse this method;
> if there is a type change, a new object has to be created, which I think is
> inevitable.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
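For illustration, a minimal sketch of the kind of conversion the description proposes: a helper that the SerDe's deserialize path could call per primitive value, returning the value untouched when it already matches the ObjectInspector's expected type, wrapping it in the expected Writable when the column was widened (e.g. int to bigint), and falling back to null otherwise. The class and method names (ParquetTypeConversion, convertPrimitive) are hypothetical and are not taken from the attached patches.

{code:java}
// Hypothetical sketch (names are illustrative; this is NOT the attached HIVE-6784 patch):
// convert a Writable read from a Parquet file into the type expected by the table's
// ObjectInspector, instead of letting the inspector throw a ClassCastException.
import org.apache.hadoop.hive.serde2.objectinspector.PrimitiveObjectInspector;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Writable;

public final class ParquetTypeConversion {

  private ParquetTypeConversion() {
  }

  public static Writable convertPrimitive(Writable value, PrimitiveObjectInspector oi) {
    if (value == null) {
      return null;
    }
    switch (oi.getPrimitiveCategory()) {
      case LONG:
        if (value instanceof LongWritable) {
          // Already the right type: return the object passed in instead of allocating.
          return value;
        }
        if (value instanceof IntWritable) {
          // Column was widened from int to bigint: wrap the stored int in a LongWritable
          // rather than letting WritableLongObjectInspector throw a ClassCastException.
          return new LongWritable(((IntWritable) value).get());
        }
        // Incompatible type stored in the file: fall back to null, as Hive does for
        // other file formats.
        return null;
      default:
        // Other primitive categories (double, string, ...) would need analogous cases.
        return value;
    }
  }
}
{code}

A deserialize implementation could apply convertPrimitive to each field of the ArrayWritable before handing the struct to the ObjectInspector; whether the actual fix performs the conversion there or inside createPrimitive is best confirmed against HIVE-6784.2.patch.txt.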