[ https://issues.apache.org/jira/browse/HIVE-21002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772078#comment-16772078 ]
Zoltan Ivanfi commented on HIVE-21002:
--------------------------------------

As we discussed on the Hive mailing list, I modified the sub-tasks of this JIRA to reflect the new solution we agreed upon: The historical (backwards- and forwards-compatible) way of handling timestamps should be restored while keeping the new semantics at the same time. The details can be read in the descriptions of the sub-tasks.

> Backwards incompatible change: Hive 3.1 reads back Avro and Parquet timestamps written by Hive 2.x incorrectly
> --------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-21002
> URL: https://issues.apache.org/jira/browse/HIVE-21002
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.1.0, 3.1.1
> Reporter: Zoltan Ivanfi
> Priority: Major
>
> Hive 3.1 reads back Avro and Parquet timestamps written by Hive 2.x incorrectly. As an example session to demonstrate this problem, create a dataset using Hive version 2.x in America/Los_Angeles:
> {code:sql}
> hive> create table ts_‹format› (ts timestamp) stored as ‹format›;
> hive> insert into ts_‹format› values ('2018-01-01 00:00:00.000');
> {code}
> Querying this table by issuing
> {code:sql}
> hive> select * from ts_‹format›;
> {code}
> from different time zones using different versions of Hive and different storage formats gives the following results:
> |‹format›|Writer time zone (in Hive 2.x)|Reader time zone|Result in Hive 2.x reader|Result in Hive 3.1 reader|
> |Avro and Parquet|America/Los_Angeles|America/Los_Angeles|2018-01-01 *00*:00:00.0|2018-01-01 *08*:00:00.0|
> |Avro and Parquet|America/Los_Angeles|Europe/Paris|2018-01-01 *09*:00:00.0|2018-01-01 *08*:00:00.0|
> |Textfile and ORC|America/Los_Angeles|America/Los_Angeles|2018-01-01 00:00:00.0|2018-01-01 00:00:00.0|
> |Textfile and ORC|America/Los_Angeles|Europe/Paris|2018-01-01 00:00:00.0|2018-01-01 00:00:00.0|
> *Hive 3.1 clearly gives different results than Hive 2.x for timestamps stored in Avro and Parquet formats.* Apache ORC behaviour has not changed because it was modified to adjust timestamps to retain backwards compatibility. Textfile behaviour has not changed because its processing involves parsing and formatting instead of proper serializing and deserializing, so it inherently had LocalDateTime semantics even in Hive 2.x.
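
The Avro and Parquet rows of the table can be reproduced with a little time zone arithmetic. The sketch below is not Hive code. It assumes a simplified model that is consistent with the table: the Hive 2.x writer normalizes the wall-clock value to UTC, the Hive 2.x reader converts it back to the reader's zone, and the Hive 3.1 reader displays the stored value without conversion. The function names are hypothetical, and the fixed offsets (UTC-8 for America/Los_Angeles, UTC+1 for Europe/Paris) hold only for the January 2018 date in the example.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets that are valid on 2018-01-01 (no DST in effect).
LOS_ANGELES = timezone(timedelta(hours=-8), "America/Los_Angeles")
PARIS = timezone(timedelta(hours=1), "Europe/Paris")

FMT = "%Y-%m-%d %H:%M:%S"


def write_normalized_to_utc(wall_clock: str, writer_zone: timezone) -> datetime:
    """Assumed writer model: interpret the value in the writer's zone, store it as UTC."""
    local = datetime.strptime(wall_clock, FMT).replace(tzinfo=writer_zone)
    return local.astimezone(timezone.utc)


def read_with_conversion(stored_utc: datetime, reader_zone: timezone) -> str:
    """Assumed Hive 2.x reader model: convert the stored UTC value to the reader's zone."""
    return stored_utc.astimezone(reader_zone).strftime(FMT)


def read_without_conversion(stored_utc: datetime) -> str:
    """Assumed Hive 3.1 reader model: show the stored value as-is, ignoring the reader's zone."""
    return stored_utc.strftime(FMT)


stored = write_normalized_to_utc("2018-01-01 00:00:00", LOS_ANGELES)

for zone in (LOS_ANGELES, PARIS):
    print(
        zone.tzname(None),
        "| 2.x-style:", read_with_conversion(stored, zone),
        "| 3.1-style:", read_without_conversion(stored),
    )
```

Under these assumptions the conversion-based reader yields hour 00 in America/Los_Angeles and hour 09 in Europe/Paris, while the reader that skips conversion yields hour 08 in both zones, matching the Avro and Parquet rows above.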