[ 
https://issues.apache.org/jira/browse/IMPALA-14383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18017841#comment-18017841
 ] 

Quanlong Huang commented on IMPALA-14383:
-----------------------------------------

The following queries can also lead to impalad crash:

{code:sql}
select cast('2025-08-31 06:23:24.1234567890' as DATE);
select cast('2025-08-31 06:23:24.123456789abcd' as DATE);
select cast('aaaa-aa-aa aa:aa:aa.123456789abcd' as DATE);
{code}

The reason is that, when classifying the string's format, we only check the 
separators ('-', ':', ' ', '.'). These strings therefore match the default 
date/time context, i.e. "yyyy-MM-dd HH:mm:ss.SSSSSSSSS", whose fractional part 
has a maximum length of 9. If the substring after the period "." is longer 
than 9 characters, we return an out-of-bounds pointer here:
{code:cpp}
const DateTimeFormatContext* SimpleDateFormatTokenizer::GetDefaultFormatContext(
    const char* str, int len, bool accept_time_toks, bool accept_time_toks_only) {
  ...
        default: {
          // There is likely a fractional component that's below the expected 9 chars.
          // We will need to work out which default context to use that corresponds to
          // the fractional length in the string.
          if (LIKELY(len > DEFAULT_SHORT_DATE_TIME_FMT_LEN)
              && LIKELY(str[19] == '.') && LIKELY(str[13] == ':')) {
            switch (str[10]) {
              case ' ': {
                return &DEFAULT_DATE_TIME_CTX[len - DEFAULT_SHORT_DATE_TIME_FMT_LEN - 1];  // <-- index could overflow here
              }
{code}
https://github.com/apache/impala/blob/f24296aed58b75a8e1d8851c5a94c6d362515bc8/be/src/runtime/datetime-simple-date-format-parser.cc#L407
In the above code, for the 34-character string in the issue description, len 
is 34 and DEFAULT_SHORT_DATE_TIME_FMT_LEN is 19, so the index is 
34 - 19 - 1 = 14. However, DEFAULT_DATE_TIME_CTX has only 10 elements.
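
The arithmetic above can be checked with a minimal sketch. This is not the 
Impala code or the actual fix: the names kShortDateTimeFmtLen, 
kNumDateTimeCtx, FractionalCtxIndex, IsCtxIndexInBounds, 
LiteralHasValidCtxIndex and CheckExamples are hypothetical stand-ins, and the 
values 19 and 10 are taken from the numbers quoted in this comment.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for the constants discussed above.
constexpr int kShortDateTimeFmtLen = 19;  // length of "yyyy-MM-dd HH:mm:ss"
constexpr int kNumDateTimeCtx = 10;       // array length quoted above

// Index that the excerpt computes for a string of length `len`.
constexpr int FractionalCtxIndex(int len) {
  return len - kShortDateTimeFmtLen - 1;
}

// True only when that index falls inside an array of kNumDateTimeCtx elements.
constexpr bool IsCtxIndexInBounds(int len) {
  return FractionalCtxIndex(len) >= 0 && FractionalCtxIndex(len) < kNumDateTimeCtx;
}

// Convenience wrapper that takes the string itself.
inline bool LiteralHasValidCtxIndex(const char* str) {
  return IsCtxIndexInBounds(static_cast<int>(std::strlen(str)));
}

// Applies the check to the strings from this issue.
inline void CheckExamples() {
  // Query from the issue description: 34 characters, index 14.
  assert(!LiteralHasValidCtxIndex("2025-08-31 06:23:24.9392129 +00:00"));
  // Queries from this comment: 30 and 33 characters, indexes 10 and 13.
  assert(!LiteralHasValidCtxIndex("2025-08-31 06:23:24.1234567890"));
  assert(!LiteralHasValidCtxIndex("2025-08-31 06:23:24.123456789abcd"));
  // Exactly 9 fractional characters: 29 characters, index 9, in bounds.
  assert(LiteralHasValidCtxIndex("2025-08-31 06:23:24.123456789"));
}
```

A guard of this shape before the array lookup would reject every string 
listed in this issue, but whether the real fix should reject such strings or 
fall back to another parsing path is a separate decision.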

> DCHECK hit in DateParser::ParseSimpleDateFormat
> -----------------------------------------------
>
>                 Key: IMPALA-14383
>                 URL: https://issues.apache.org/jira/browse/IMPALA-14383
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 5.0.0
>            Reporter: Riza Suminto
>            Assignee: Quanlong Huang
>            Priority: Major
>         Attachments: resolved-disable-codegen.txt
>
>
> DCHECK hit in DateParser::ParseSimpleDateFormat in a DEBUG build for the 
> following query:
> {code:java}
> select cast('2025-08-31 06:23:24.9392129 +00:00' as DATE); {code}
> The error log shows up like this:
> {code:java}
> F20250902 14:48:45.499524 1866880 date-parse-util.cc:41] 3e49b1bed2617add:810c9d2e00000000] Check failed: dt_ctx.has_date_toks
> Minidump in thread [1866880]hiveserver2-frontend-2 running query 3e49b1bed2617add:810c9d2e00000000, fragment instance 0000000000000000:0000000000000000
> Wrote minidump to /home/rsuminto/workspace/impala/logs/cluster/minidumps/impalad/831dc502-7bea-4778-79ba21a4-c8ee66c8.dmp
>  {code}
> Attached is the resolved minidump without codegen.


