[
https://issues.apache.org/jira/browse/IMPALA-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18018013#comment-18018013
]
ASF subversion and git services commented on IMPALA-14383:
----------------------------------------------------------
Commit 0dfed88861c03ecf466f8762ae1f03756518da88 in impala's branch
refs/heads/master from stiga-huang
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=0dfed8886 ]
IMPALA-14383: Fix crash in casting timestamp string with timezone offsets to
DATE
A timestamp string can have a timezone offset at its end, e.g.
"2025-08-31 06:23:24.9392129 +08:00" has "+08:00" as the timezone
offset. When casting strings to DATE type, we try to find the default
format by matching the separators, i.e. '-', ':', ' ', etc., in
SimpleDateFormatTokenizer::GetDefaultFormatContext(). The one that
matches this example is DEFAULT_DATE_TIME_CTX[] which represents the
default date/time context for "yyyy-MM-dd HH:mm:ss.SSSSSSSSS". The
fractional part at the end can have length from 0 to 9, matching
DEFAULT_DATE_TIME_CTX[0] to DEFAULT_DATE_TIME_CTX[9] respectively.
When calculating which item in DEFAULT_DATE_TIME_CTX is the matched
format, we use the index str_len - 20, where 20 is the length of
"yyyy-MM-dd HH:mm:ss.". The index goes out of bounds if the string is
longer than 29 characters; the example above has 34 characters, giving
index 14. GetDefaultFormatContext() then returns a wild pointer, leading
to a crash in the code that follows.
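The index arithmetic above can be sketched in a few lines of C++. This is only an illustration of the numbers in this message, not the Impala source: the names kPrefixLen, kNumContexts, UncheckedIndex, IndexForString and IndexInBounds are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical constants mirroring the numbers in the commit message.
constexpr std::size_t kPrefixLen = 20;    // length of "yyyy-MM-dd HH:mm:ss."
constexpr std::size_t kNumContexts = 10;  // DEFAULT_DATE_TIME_CTX[0] .. [9]

// Index produced by the unchecked arithmetic (str_len - 20).
// Precondition: the string is at least as long as the fixed prefix.
inline std::size_t UncheckedIndex(std::size_t str_len) {
  assert(str_len >= kPrefixLen);
  return str_len - kPrefixLen;
}

// Convenience wrapper that takes the string itself.
inline std::size_t IndexForString(const char* str) {
  return UncheckedIndex(std::strlen(str));
}

// True only when the unchecked index still points inside the table.
inline bool IndexInBounds(std::size_t str_len) {
  return UncheckedIndex(str_len) < kNumContexts;
}
```

With these definitions, a 29-character string with nine fractional digits maps to index 9, the last valid slot, while the 34-character example with the "+08:00" suffix maps to index 14, which is past the end of a 10-element table.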
This patch fixes the issue by adding a check to make sure the string
length does not exceed the max length of the default date time format,
i.e. DEFAULT_DATE_TIME_FMT_LEN (29). Longer strings will use a
DateTimeFormatContext created lazily.
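A minimal sketch of such a length check follows, assuming a simplified table of ten precomputed contexts. FormatContext, kDefaultContexts and LookupDefaultContext are hypothetical stand-ins, not the actual Impala types or functions; a nullptr return here stands for "fall back to a lazily created context".

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for Impala's DateTimeFormatContext.
struct FormatContext {
  int fractional_digits;
};

constexpr std::size_t kDateTimePrefixLen = 20;      // "yyyy-MM-dd HH:mm:ss."
constexpr std::size_t kDefaultDateTimeFmtLen = 29;  // value given for DEFAULT_DATE_TIME_FMT_LEN
constexpr std::size_t kNumDefaultContexts =
    kDefaultDateTimeFmtLen - kDateTimePrefixLen + 1;  // 10 entries

// One precomputed context per fractional length 0..9.
static const FormatContext kDefaultContexts[kNumDefaultContexts] = {
    {0}, {1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, {9}};

// Returns a precomputed context, or nullptr when the length falls outside
// the range the table covers and the caller must build a context lazily.
inline const FormatContext* LookupDefaultContext(std::size_t str_len) {
  if (str_len < kDateTimePrefixLen || str_len > kDefaultDateTimeFmtLen) {
    return nullptr;
  }
  const std::size_t idx = str_len - kDateTimePrefixLen;
  assert(idx < kNumDefaultContexts);
  return &kDefaultContexts[idx];
}
```

The lower bound in this sketch is a simplification: it only models the fractional-seconds table described in this message, not the other default formats that shorter strings may match.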
Note that this just fixes the crash. Converting timestamp strings with
timezone offset at the end to DATE type is not supported yet and will be
followed up in IMPALA-14391.
Tests
- Added e2e tests on constant expressions. Also added a test table with
such timestamp strings and a test on it.
Change-Id: I36d73f4a71432588732b2284ac66552f75628a62
Reviewed-on: http://gerrit.cloudera.org:8080/23371
Reviewed-by: Daniel Becker <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> DCHECK hit in DateParser::ParseSimpleDateFormat
> -----------------------------------------------
>
> Key: IMPALA-14383
> URL: https://issues.apache.org/jira/browse/IMPALA-14383
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 5.0.0
> Reporter: Riza Suminto
> Assignee: Quanlong Huang
> Priority: Major
> Attachments: resolved-disable-codegen.txt
>
>
> DCHECK hit in DateParser::ParseSimpleDateFormat in a DEBUG build for
> the following query:
> {code:java}
> select cast('2025-08-31 06:23:24.9392129 +00:00' as DATE); {code}
> The error log shows up like this:
> {code:java}
> F20250902 14:48:45.499524 1866880 date-parse-util.cc:41]
> 3e49b1bed2617add:810c9d2e00000000] Check failed: dt_ctx.has_date_toks
> Minidump in thread [1866880]hiveserver2-frontend-2 running query
> 3e49b1bed2617add:810c9d2e00000000, fragment instance
> 0000000000000000:0000000000000000
> Wrote minidump to
> /home/rsuminto/workspace/impala/logs/cluster/minidumps/impalad/831dc502-7bea-4778-79ba21a4-c8ee66c8.dmp
> {code}
> Attached is the resolved minidump without codegen.