[
https://issues.apache.org/jira/browse/IMPALA-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18017841#comment-18017841
]
Quanlong Huang commented on IMPALA-14383:
-----------------------------------------
The following queries can also crash impalad:
{code:sql}
select cast('2025-08-31 06:23:24.1234567890' as DATE);
select cast('2025-08-31 06:23:24.123456789abcd' as DATE);
select cast('aaaa-aa-aa aa:aa:aa.123456789abcd' as DATE);
{code}
The cause is that the format check only looks at the separators in the
string, e.g. '-', ':', ' ', '.'. Any string whose separators line up with the
default date/time format, i.e. "yyyy-MM-dd HH:mm:ss.SSSSSSSSS", is treated as
matching it. The fractional part can be at most 9 characters long, so if the
substring after the period "." is longer than 9 characters, we return an
out-of-bounds pointer here:
{code:cpp}
const DateTimeFormatContext* SimpleDateFormatTokenizer::GetDefaultFormatContext(
    const char* str, int len, bool accept_time_toks, bool accept_time_toks_only) {
  ...
    default: {
      // There is likely a fractional component that's below the expected 9 chars.
      // We will need to work out which default context to use that corresponds to
      // the fractional length in the string.
      if (LIKELY(len > DEFAULT_SHORT_DATE_TIME_FMT_LEN)
          && LIKELY(str[19] == '.') && LIKELY(str[13] == ':')) {
        switch (str[10]) {
          case ' ': {
            return &DEFAULT_DATE_TIME_CTX[len - DEFAULT_SHORT_DATE_TIME_FMT_LEN - 1]; // <-- index could overflow here
          }
{code}
https://github.com/apache/impala/blob/f24296aed58b75a8e1d8851c5a94c6d362515bc8/be/src/runtime/datetime-simple-date-format-parser.cc#L407
In the above code, for the query in the issue description, len is 34 and
DEFAULT_SHORT_DATE_TIME_FMT_LEN is 19, so the index is 34 - 19 - 1 = 14.
However, DEFAULT_DATE_TIME_CTX has only 10 elements, so the valid indices are
0 to 9.
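A minimal sketch of the bounds check that this lookup is missing. It is not the
actual Impala fix: FormatContext, kDateTimeCtx and GetContextOrNull are
hypothetical stand-ins, and only the two constants (19 and 10) come from the
numbers above. Returning nullptr for an over-long fractional part is an
assumption about how a caller would want the failure reported.

```cpp
#include <cassert>

// From the comment above: "yyyy-MM-dd HH:mm:ss" is 19 characters long and
// DEFAULT_DATE_TIME_CTX has 10 entries.
constexpr int kShortDateTimeFmtLen = 19;
constexpr int kNumDateTimeCtx = 10;

// Hypothetical stand-in for DateTimeFormatContext.
struct FormatContext {
  int unused = 0;
};

// Hypothetical stand-in for DEFAULT_DATE_TIME_CTX.
static const FormatContext kDateTimeCtx[kNumDateTimeCtx] = {};

// Returns the context for a string of length 'len', or nullptr when the number
// of characters after the period does not map to an entry in the array.
const FormatContext* GetContextOrNull(int len) {
  assert(len >= 0);  // a string length is never negative
  const int idx = len - kShortDateTimeFmtLen - 1;  // characters after '.'
  if (idx < 0 || idx >= kNumDateTimeCtx) return nullptr;
  return &kDateTimeCtx[idx];
}
```

With this check, len = 34 (index 14) and len = 33 (index 13) return nullptr
instead of a pointer past the end of the array, while len = 29 (index 9) still
resolves to the last valid entry.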
> DCHECK hit in DateParser::ParseSimpleDateFormat
> -----------------------------------------------
>
> Key: IMPALA-14383
> URL: https://issues.apache.org/jira/browse/IMPALA-14383
> Project: IMPALA
> Issue Type: Bug
> Affects Versions: Impala 5.0.0
> Reporter: Riza Suminto
> Assignee: Quanlong Huang
> Priority: Major
> Attachments: resolved-disable-codegen.txt
>
>
> DCHECK hit in DateParser::ParseSimpleDateFormat under a DEBUG build for the
> following query:
> {code:java}
> select cast('2025-08-31 06:23:24.9392129 +00:00' as DATE); {code}
> The error log shows up like this:
> {code:java}
> F20250902 14:48:45.499524 1866880 date-parse-util.cc:41]
> 3e49b1bed2617add:810c9d2e00000000] Check failed: dt_ctx.has_date_toks
> Minidump in thread [1866880]hiveserver2-frontend-2 running query
> 3e49b1bed2617add:810c9d2e00000000, fragment instance
> 0000000000000000:0000000000000000
> Wrote minidump to
> /home/rsuminto/workspace/impala/logs/cluster/minidumps/impalad/831dc502-7bea-4778-79ba21a4-c8ee66c8.dmp
> {code}
> Attached is the resolved minidump without codegen.