[ https://issues.apache.org/jira/browse/HIVE-12192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221507#comment-16221507 ]
Jesus Camacho Rodriguez edited comment on HIVE-12192 at 10/27/17 12:28 AM:
---------------------------------------------------------------------------

Very much wip but I had been working on this so I attach a draft to see what ptest gives. The idea would be to store the timestamp in ORC as UTC too, but a couple of constructors for the reader/writer of timestamp values need to be extended so we can specify the timezone ourselves instead of taking the system timezone automatically.

was (Author: jcamachorodriguez):
Very much wip but I had been working on this so I attach a draft to see what ptest gives. The idea would be to store the timestamp in ORC as UTC too, but a couple of constructors for the reader/writer of timestamp values need to be extended so we can specify the timezone ourselves instead of taking the system timezone.

> Hive should carry out timestamp computations in UTC
> ---------------------------------------------------
>
>                 Key: HIVE-12192
>                 URL: https://issues.apache.org/jira/browse/HIVE-12192
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Hive
>            Reporter: Ryan Blue
>            Assignee: Jesus Camacho Rodriguez
>              Labels: timestamp
>         Attachments: HIVE-12192.patch
>
>
> Hive currently uses the "local" time of a java.sql.Timestamp to represent the
> SQL data type TIMESTAMP WITHOUT TIME ZONE. The purpose is to be able to use
> {{Timestamp#getYear()}} and similar methods to implement SQL functions like
> {{year}}.
> When the SQL session's time zone is a DST zone, such as America/Los_Angeles
> that alternates between PST and PDT, there are times that cannot be
> represented because the effective zone skips them.
> {code}
> hive> select TIMESTAMP '2015-03-08 02:10:00.101';
> 2015-03-08 03:10:00.101
> {code}
> Using UTC instead of the SQL session time zone as the underlying zone for a
> java.sql.Timestamp avoids this bug, while still returning correct values for
> {{getYear}} etc.
> Using UTC as the convenience representation (timestamp without time zone has
> no real zone) would make timestamp calculations more consistent and avoid
> similar problems in the future.
> Notably, this would break the {{unix_timestamp}} UDF that specifies the
> result is with respect to ["the default timezone and default
> locale"|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-DateFunctions].
> That function would need to be updated to use the
> {{System.getProperty("user.timezone")}} zone.
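
The skipped-hour behaviour described in the issue can be reproduced outside Hive. The following is a minimal sketch, not Hive code and not part of the attached patch. It uses java.time rather than java.sql.Timestamp so that the result does not depend on the JVM default time zone; the class name DstGapDemo and the helper roundTrip are hypothetical names chosen for illustration.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

public class DstGapDemo {

    // Interpret a wall-clock value in the given zone, then read the wall-clock
    // value back. If the value falls in a DST gap, ZonedDateTime.of shifts it
    // later by the length of the gap, so the round trip changes the value.
    static LocalDateTime roundTrip(LocalDateTime wallClock, ZoneId zone) {
        return ZonedDateTime.of(wallClock, zone).toLocalDateTime();
    }

    public static void main(String[] args) {
        // The literal from the issue: 2015-03-08 02:10:00.101
        LocalDateTime literal = LocalDateTime.of(2015, 3, 8, 2, 10, 0, 101_000_000);

        // America/Los_Angeles skips 02:00-03:00 on this date.
        System.out.println(roundTrip(literal, ZoneId.of("America/Los_Angeles")));
        // prints 2015-03-08T03:10:00.101

        // UTC has no DST transitions, so the value survives unchanged.
        System.out.println(roundTrip(literal, ZoneOffset.UTC));
        // prints 2015-03-08T02:10:00.101
    }
}
```

Because UTC has no gaps or overlaps, every wall-clock value maps to exactly one instant, which is why it works as a neutral backing zone for a type that has no real zone.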