[ https://issues.apache.org/jira/browse/HIVE-25292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
shezm reassigned HIVE-25292:
----------------------------

    Assignee: shezm

> to_unix_timestamp & unix_timestamp should support ENGLISH format by default
> ---------------------------------------------------------------------------
>
>                 Key: HIVE-25292
>                 URL: https://issues.apache.org/jira/browse/HIVE-25292
>             Project: Hive
>          Issue Type: Improvement
>          Components: Clients
>            Reporter: shezm
>            Assignee: shezm
>            Priority: Major
>             Fix For: 3.2.0
>
>
> Hi,
> The to_unix_timestamp function is implemented by GenericUDFToUnixTimeStamp.
> It uses SimpleDateFormat to parse string-typed times.
> However, the SimpleDateFormat is created without a Locale argument, so the
> default locale of the JVM is used. As a result, on machines with a
> non-English default locale, queries like the following return NULL:
>
> {code:java}
> hive> select to_unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
> OK
> NULL
> hive> select unix_timestamp('16/Mar/2017:12:25:01', 'dd/MMM/yyy:HH:mm:ss');
> OK
> NULL
> {code}
>
> I also found that in Spark, to_unix_timestamp & unix_timestamp use
> SimpleDateFormat as well, and Spark uses Locale.US by default. That makes it
> impossible to parse dates written in the local language. For example, in a
> Chinese environment, Hive parses this input correctly:
>
> {code:java}
> hive> select to_unix_timestamp('16/三月/2017:12:25:01', 'dd/MMMM/yyy:HH:mm:ss');
> OK
> 1489638301
> Time taken: 0.147 seconds, Fetched: 1 row(s)
> OK
> NULL
> {code}
> But Spark returns NULL.
> Because English-format dates are the most common, I think two
> SimpleDateFormat instances are needed: the existing one, plus a new one
> initialized with Locale.ENGLISH.
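
The two-formatter idea proposed above can be sketched in plain Java. This is only an illustration under stated assumptions, not the actual Hive patch: the class `LocaleFallbackParser` and its method `toUnixTimestamp` are hypothetical names, the primary locale and time zone are passed in explicitly rather than taken from the JVM or session, and Asia/Shanghai is assumed because it reproduces the 1489638301 value shown in the issue.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper (not Hive code): try the primary locale first,
// then fall back to Locale.ENGLISH, as the issue proposes.
class LocaleFallbackParser {

    // Returns epoch seconds, or null if neither locale can parse the text
    // (mirroring the NULL seen in the Hive output above).
    static Long toUnixTimestamp(String text, String pattern,
                                Locale primary, TimeZone zone) {
        for (Locale locale : new Locale[] {primary, Locale.ENGLISH}) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, locale);
            format.setTimeZone(zone);
            try {
                // Date.getTime() is in milliseconds; convert to seconds.
                return format.parse(text).getTime() / 1000L;
            } catch (ParseException e) {
                // This locale could not parse the text; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumed zone: chosen to match the value printed in the issue.
        TimeZone zone = TimeZone.getTimeZone("Asia/Shanghai");

        // English month abbreviation, parsed via the Locale.ENGLISH fallback.
        System.out.println(toUnixTimestamp(
            "16/Mar/2017:12:25:01", "dd/MMM/yyy:HH:mm:ss", Locale.CHINA, zone));

        // Chinese month name (U+4E09 U+6708, "March"), written with Unicode
        // escapes so the source file stays ASCII-only.
        System.out.println(toUnixTimestamp(
            "16/\u4e09\u6708/2017:12:25:01", "dd/MMMM/yyy:HH:mm:ss", Locale.CHINA, zone));
    }
}
```

With the assumed Asia/Shanghai zone, the English input yields 1489638301, the same value Hive printed for the Chinese input in the issue. The Chinese input should parse through the primary locale, provided the JDK ships Chinese locale data. Trying the primary locale first keeps local-language dates working, while the ENGLISH fallback covers the common log-style format.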