[ https://issues.apache.org/jira/browse/HIVE-20154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vincent Tran updated HIVE-20154: -------------------------------- Component/s: UDF > Improve unix_timestamp(args) to handle automatic-DST switching timezones > ------------------------------------------------------------------------ > > Key: HIVE-20154 > URL: https://issues.apache.org/jira/browse/HIVE-20154 > Project: Hive > Issue Type: Improvement > Components: UDF > Affects Versions: 1.1.0 > Reporter: Vincent Tran > Priority: Major > > Currently unix_timestamp(args) UDF will only handle static timezone > specifiers. It does not recognize SystemV specifiers such as EST5EDT or > PST8PDT. > Based on this experiment, when z is used to parse a TZ string like UTC4PDT > (obviously not a valid SystemV specifier) - it will parse the time as UTC. > When z<n>z is used to parse a TZ string like UTC4PDT, it will take parse the > timestamp as the TZ of the final z position. This is demonstrated by my final > query when the format string z4z1z is used to parse UTC4PDT1EDT. > {noformat} > 0: jdbc:hive2://localhost:10000/default>; select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "yyyy-MM-dd > HH:mm:ss z"), "yyyy-MM-dd HH:mm:ss zzzz"); > +--------------------------------------------+--+ > | _c0 | > +--------------------------------------------+--+ > | 2018-01-31 16:00:00 Pacific Standard Time | > +--------------------------------------------+--+ > 1 row selected (0.041 seconds) > 0: jdbc:hive2://localhost:10000/default>; select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC", "yyyy-MM-dd HH:mm:ss > z"), "yyyy-MM-dd HH:mm:ss zzzz"); > +--------------------------------------------+--+ > | _c0 | > +--------------------------------------------+--+ > | 2018-01-31 16:00:00 Pacific Standard Time | > +--------------------------------------------+--+ > 1 row selected (0.041 seconds) > 0: jdbc:hive2://localhost:10000/default>; select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT", "yyyy-MM-dd > HH:mm:ss z4z"), "yyyy-MM-dd HH:mm:ss zzzz"); > +--------------------------------------------+--+ > | _c0 | > +--------------------------------------------+--+ > | 2018-01-31 23:00:00 Pacific Standard Time | > +--------------------------------------------+--+ > 1 row selected (0.047 seconds) > 0: jdbc:hive2://localhost:10000/default>; select > from_unixtime(unix_timestamp("2018-02-01 00:00:00 UTC4PDT1EDT", "yyyy-MM-dd > HH:mm:ss z4z1z"), "yyyy-MM-dd HH:mm:ss zzzz"); > +--------------------------------------------+--+ > | _c0 | > +--------------------------------------------+--+ > | 2018-01-31 20:00:00 Pacific Standard Time | > +--------------------------------------------+--+ > 1 row selected (0.055 seconds) > 0: jdbc:hive2://localhost:10000/default>; > {noformat} > So all in all, I don't think the SystemV specifier EST5EDT or PST8PDT are > valid to unix_timestamp(args) at all. And that those when parsed with the > z<n>z format string, will be read as whatever valid timezone at the final > position (effectively EDT and PDT respectively in when those valid SystemV TZ > specifiers above are used). -- This message was sent by Atlassian JIRA (v7.6.3#76005)