[ https://issues.apache.org/jira/browse/HIVE-27199?focusedWorklogId=857339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-857339 ]
ASF GitHub Bot logged work on HIVE-27199:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 17/Apr/23 10:40
            Start Date: 17/Apr/23 10:40
    Worklog Time Spent: 10m
      Work Description: TuroczyX commented on code in PR #4170:
URL: https://github.com/apache/hive/pull/4170#discussion_r1168500611


##########
common/src/java/org/apache/hive/common/util/TimestampParser.java:
##########
@@ -199,6 +205,19 @@ public Timestamp parseTimestamp(final String text) {
   }
+  public TimestampTZ parseTimestamp(String text, ZoneId defaultTimeZone) {
+    Objects.requireNonNull(text);
+    for (DateTimeFormatter f : dtFormatters) {
+      try {
+        return TimestampTZUtil.parse(text, defaultTimeZone, f);
+      } catch (DateTimeException e) {

Review Comment:
   Also, from a pattern perspective, a TryParse would be more elegant in this case. Of course it is just a preference, but I like this pattern; it is much more descriptive when reading the code.
   https://learn.microsoft.com/en-us/dotnet/api/system.int32.tryparse?view=net-8.0#system-int32-tryparse(system-string-system-int32@)

   I know the ref and out keywords do not exist in Java, but the same thing can be handled through the return type. (Just FYI, no need to change.)


Issue Time Tracking
-------------------

    Worklog Id:     (was: 857339)
    Time Spent: 50m  (was: 40m)

> Read TIMESTAMP WITH LOCAL TIME ZONE columns from text files using custom formats
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-27199
>                 URL: https://issues.apache.org/jira/browse/HIVE-27199
>             Project: Hive
>          Issue Type: Improvement
>          Components: Serializers/Deserializers
>    Affects Versions: 4.0.0-alpha-2
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Timestamp values come in many flavors and formats and there is no single
> representation that can satisfy everyone, especially when such values are
> stored in plain text/csv files.
> HIVE-9298 added a special SERDE property, {{timestamp.formats}}, that
> lets users provide custom timestamp patterns so that TIMESTAMP values
> coming from files are parsed correctly.
> However, when the column type is TIMESTAMP WITH LOCAL TIME ZONE (LTZ) it is
> not possible to use a custom pattern; thus, when the built-in Hive parser does
> not match the expected format, a NULL value is returned.
> Consider a text file, F1, with the following values:
> {noformat}
> 2016-05-03 12:26:34
> 2016-05-03T12:26:34
> {noformat}
> and a table with a column declared as LTZ.
> {code:sql}
> CREATE TABLE ts_table (ts TIMESTAMP WITH LOCAL TIME ZONE);
> LOAD DATA LOCAL INPATH './F1' INTO TABLE ts_table;
> SELECT * FROM ts_table;
> 2016-05-03 12:26:34.0 US/Pacific
> NULL
> {code}
> In order to give more flexibility to users relying on the TIMESTAMP WITH
> LOCAL TIME ZONE datatype, and also to align the behavior with the TIMESTAMP
> type, this JIRA aims to reuse the {{timestamp.formats}} property for both
> TIMESTAMP types.
> The work here focuses exclusively on simple text files, but the same could be
> done for other SERDEs such as JSON.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
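
For reference, the TryParse-style idea from the review comment can be sketched in plain Java by moving the success/failure signal into the return type. This is only an illustration under stated assumptions, not the code in PR #4170: the class `TryParseSketch` and the method `tryParse` are hypothetical names, and the sketch uses only `java.time` (a `LocalDateTime` interpreted in a default zone) rather than Hive's `TimestampTZUtil` or `TimestampTZ`.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical illustration class; it is not part of Hive.
class TryParseSketch {

  // Tries each formatter in order and returns the first successful parse.
  // An empty Optional plays the role of TryParse returning false in .NET.
  static Optional<ZonedDateTime> tryParse(
      String text, ZoneId defaultZone, List<DateTimeFormatter> formatters) {
    Objects.requireNonNull(text);
    for (DateTimeFormatter f : formatters) {
      try {
        // Assumption: the text carries no zone, so the default zone is applied.
        return Optional.of(LocalDateTime.parse(text, f).atZone(defaultZone));
      } catch (DateTimeException e) {
        // This pattern did not match; fall through to the next formatter.
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    // Patterns chosen to match the two sample values in file F1.
    List<DateTimeFormatter> formatters = Arrays.asList(
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss"),
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss"));
    ZoneId zone = ZoneId.of("US/Pacific");
    for (String s : Arrays.asList("2016-05-03 12:26:34", "2016-05-03T12:26:34", "oops")) {
      System.out.println(s + " -> " + tryParse(s, zone, formatters));
    }
  }
}
```

With this shape the caller decides what an unparseable value means (for example, mapping an empty result to NULL) instead of the parser encoding that choice through exceptions or a null return.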