[ https://issues.apache.org/jira/browse/HIVE-26373?focusedWorklogId=788618&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-788618 ]
ASF GitHub Bot logged work on HIVE-26373:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 07/Jul/22 13:05
            Start Date: 07/Jul/22 13:05
    Worklog Time Spent: 10m
      Work Description: zabetak commented on PR #3418:
URL: https://github.com/apache/hive/pull/3418#issuecomment-1177583388

   Hive has always converted data from the local time zone to UTC when writing and from UTC to the local time zone when reading. I updated the way the timestamp is stored in HBase (https://github.com/apache/hive/pull/3418/commits/fc9bc94be427a02485b089c2aeb6b494644beb05) to make it coherent with the way it is read by the query.

   There are properties and Avro file metadata that control whether the conversion is performed (e.g., `hive.avro.timestamp.skip.conversion`), but these do not work at the moment for HBase (and basically anything that relies on `AvroLazyObjectInspector`). This is a bug that should be fixed, but it is out of the scope of this PR.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 788618)
    Time Spent: 40m  (was: 0.5h)

> ClassCastException while inserting Avro data into Hbase table for nested struct with Timestamp
> ----------------------------------------------------------------------------------------------
>
>                 Key: HIVE-26373
>                 URL: https://issues.apache.org/jira/browse/HIVE-26373
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Soumyakanti Das
>            Assignee: Soumyakanti Das
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> For Avro data where the schema has a nested struct with a Timestamp datatype,
> we get the following ClassCastException:
> {code:java}
> 2022-07-05T08:40:51,572 ERROR [LocalJobRunner Map Task Executor #0] mr.ExecMapper: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:573)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:148)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
>     at org.apache.hadoop.hive.ql.exec.mr.ExecMapRunner.run(ExecMapRunner.java:37)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:465)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:271)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.ClassCastException: org.apache.hadoop.hive.common.type.Timestamp cannot be cast to org.apache.hadoop.hive.serde2.lazy.LazyPrimitive
>     at org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.AbstractPrimitiveLazyObjectInspector.getPrimitiveWritableObject(AbstractPrimitiveLazyObjectInspector.java:40)
>     at org.apache.hadoop.hive.serde2.lazy.objectinspector.primitive.LazyTimestampObjectInspector.getPrimitiveWritableObject(LazyTimestampObjectInspector.java:29)
>     at org.apache.hadoop.hive.serde2.lazy.LazyUtils.writePrimitiveUTF8(LazyUtils.java:308)
>     at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serialize(LazySimpleSerDe.java:292)
>     at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.serializeField(LazySimpleSerDe.java:247)
>     at org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe.doSerialize(LazySimpleSerDe.java:231)
>     at org.apache.hadoop.hive.serde2.AbstractEncodingAwareSerDe.serialize(AbstractEncodingAwareSerDe.java:55)
>     at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1059)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
>     at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
>     at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:937)
>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:128)
>     at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:152)
>     at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:552)
>     ... 11 more
> {code}
> The problem starts in the {{toLazyObject}} method of
> {*}AvroLazyObjectInspector.java{*}, when
> [this|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/avro/AvroLazyObjectInspector.java#L347]
> condition returns false for {*}Timestamp{*}, preventing the conversion of
> *Timestamp* to *LazyTimestamp*
> [here|https://github.com/apache/hive/blob/53009126f6fe7ccf24cf052fd6c156542f38b19d/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java#L132].
> The solution is to return {{true}} for Timestamps in the {{isPrimitive}}
> method.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
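
For readers who want to see the shape of the fix proposed in the issue description (returning true for Timestamps in `isPrimitive`), here is a minimal, self-contained sketch. It is not the Hive source: the class `IsPrimitiveSketch`, the nested `Timestamp` stand-in, and the methods `isPrimitiveBefore` / `isPrimitiveAfter` are hypothetical names invented for this illustration, and the "before" check is only an assumed approximation of a primitive/wrapper/String test.

```java
// Illustrative sketch only; none of these names come from the Hive code base.
final class IsPrimitiveSketch {

    /** Hypothetical stand-in for org.apache.hadoop.hive.common.type.Timestamp. */
    static final class Timestamp {
        final long epochMillis;

        Timestamp(long epochMillis) {
            this.epochMillis = epochMillis;
        }
    }

    /** True for the eight Java wrapper classes of primitive types. */
    private static boolean isWrapper(Class<?> c) {
        return c == Boolean.class || c == Byte.class || c == Character.class
            || c == Short.class || c == Integer.class || c == Long.class
            || c == Float.class || c == Double.class;
    }

    /** Assumed shape of the check before the fix: Timestamp is not recognized. */
    static boolean isPrimitiveBefore(Class<?> clazz) {
        return clazz.isPrimitive() || isWrapper(clazz) || clazz == String.class;
    }

    /** Same check, additionally treating the Timestamp stand-in as primitive. */
    static boolean isPrimitiveAfter(Class<?> clazz) {
        return isPrimitiveBefore(clazz) || clazz == Timestamp.class;
    }

    public static void main(String[] args) {
        System.out.println(isPrimitiveBefore(Timestamp.class)); // prints "false"
        System.out.println(isPrimitiveAfter(Timestamp.class));  // prints "true"
    }
}
```

In this sketch, a value whose class fails the check would be passed along unconverted, which corresponds to the situation the description blames for the ClassCastException; once the check accepts the timestamp class, the value can take the conversion path instead.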