[ 
https://issues.apache.org/jira/browse/HIVE-21002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772078#comment-16772078
 ] 

Zoltan Ivanfi commented on HIVE-21002:
--------------------------------------

As we discussed on the Hive mailing list, I have modified the sub-tasks of this 
JIRA to reflect the new solution we agreed upon: the historical (backwards- and 
forwards-compatible) way of handling timestamps should be restored, while the 
new semantics are kept available at the same time. The details are in the 
descriptions of the sub-tasks.

> Backwards incompatible change: Hive 3.1 reads back Avro and Parquet 
> timestamps written by Hive 2.x incorrectly
> --------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-21002
>                 URL: https://issues.apache.org/jira/browse/HIVE-21002
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 3.1.1
>            Reporter: Zoltan Ivanfi
>            Priority: Major
>
> Hive 3.1 reads back Avro and Parquet timestamps written by Hive 2.x 
> incorrectly. To demonstrate the problem, create a dataset using Hive 2.x in 
> the America/Los_Angeles time zone:
> {code:sql}
> hive> create table ts_‹format› (ts timestamp) stored as ‹format›;
> hive> insert into ts_‹format› values ('2018-01-01 00:00:00.000');
> {code}
> Querying this table by issuing
> {code:sql}
> hive> select * from ts_‹format›;
> {code}
> from different time zones using different versions of Hive and different 
> storage formats gives the following results:
> |‹format›|Writer time zone (in Hive 2.x)|Reader time zone|Result in Hive 2.x reader|Result in Hive 3.1 reader|
> |Avro and Parquet|America/Los_Angeles|America/Los_Angeles|2018-01-01 *00*:00:00.0|2018-01-01 *08*:00:00.0|
> |Avro and Parquet|America/Los_Angeles|Europe/Paris|2018-01-01 *09*:00:00.0|2018-01-01 *08*:00:00.0|
> |Textfile and ORC|America/Los_Angeles|America/Los_Angeles|2018-01-01 00:00:00.0|2018-01-01 00:00:00.0|
> |Textfile and ORC|America/Los_Angeles|Europe/Paris|2018-01-01 00:00:00.0|2018-01-01 00:00:00.0|
> *Hive 3.1 clearly gives different results than Hive 2.x for timestamps stored 
> in Avro and Parquet formats.* Apache ORC behaviour has not changed because it 
> was modified to adjust timestamps to retain backwards compatibility. Textfile 
> behaviour has not changed, because its processing involves parsing and 
> formatting instead of proper serializing and deserializing, so Textfile 
> timestamps inherently had LocalDateTime semantics even in Hive 2.x.
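
The Avro and Parquet rows of the table can be reproduced with a little time zone arithmetic. The following is only a rough sketch, not Hive code: it assumes that the Hive 2.x writer stores the UTC instant corresponding to the writer's local time, that the Hive 2.x reader converts that instant back to the reader's zone, and that the Hive 3.1 reader shows the stored UTC value unchanged. The function names are hypothetical, and fixed January offsets (UTC-8 and UTC+1) stand in for the real America/Los_Angeles and Europe/Paris zones.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed offsets valid on 2018-01-01 replace the real zone rules
# (America/Los_Angeles = UTC-8, Europe/Paris = UTC+1 in January).
LOS_ANGELES = timezone(timedelta(hours=-8))
PARIS = timezone(timedelta(hours=1))


def write_hive2(local_text, writer_tz):
    """Hypothetical model of the Hive 2.x Avro/Parquet writer: interpret the
    literal in the writer's zone and store the matching UTC instant."""
    local = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    return local.replace(tzinfo=writer_tz).astimezone(timezone.utc)


def read_hive2(stored_utc, reader_tz):
    """Instant semantics: convert the stored instant to the reader's zone."""
    return stored_utc.astimezone(reader_tz).replace(tzinfo=None)


def read_hive31(stored_utc, reader_tz):
    """LocalDateTime semantics: return the stored UTC wall-clock value as is.
    The reader's zone is deliberately ignored."""
    return stored_utc.replace(tzinfo=None)


stored = write_hive2("2018-01-01 00:00:00", LOS_ANGELES)
for name, tz in (("America/Los_Angeles", LOS_ANGELES), ("Europe/Paris", PARIS)):
    print(name, "| 2.x reader:", read_hive2(stored, tz),
          "| 3.1 reader:", read_hive31(stored, tz))
```

Under these assumptions the modelled 2.x reader yields 00:00 in Los Angeles and 09:00 in Paris, while the modelled 3.1 reader yields 08:00 in both, matching the table.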


