linshangquan created FLINK-35318:
------------------------------------

             Summary: incorrect timezone handling for 
TIMESTAMP_WITH_LOCAL_TIME_ZONE type during predicate pushdown
                 Key: FLINK-35318
                 URL: https://issues.apache.org/jira/browse/FLINK-35318
             Project: Flink
          Issue Type: Bug
          Components: Table SQL / API
    Affects Versions: 1.16.1
         Environment: flink version 1.16.1

iceberg version 1.14.3
            Reporter: linshangquan
         Attachments: image-2024-05-09-14-06-58-007.png, 
image-2024-05-09-14-09-38-453.png, image-2024-05-09-14-11-38-476.png

In our scenario, we have an Iceberg table with a column named 'time' of the 
{{timestamptz}} data type. The table contains 10 rows whose 'time' value is 
{{'2024-04-30 07:00:00'}}, expressed in the "Asia/Shanghai" timezone.

!image-2024-05-09-14-06-58-007.png!

We get inconsistent query results when reading this table through 
Iceberg-flink.

When the {{WHERE}} clause filters on the {{time}} column, the results are 
incorrect.

{{ZoneId.systemDefault()}} = "Asia/Shanghai"

!image-2024-05-09-14-09-38-453.png!

When there is no {{WHERE}} clause, the results are correct.

During debugging, we found that when a {{WHERE}} clause is present, a 
{{FilterPushDownSpec}} is generated, and this {{FilterPushDownSpec}} utilizes 
{{RexNodeToExpressionConverter}} for translation.

!image-2024-05-09-14-11-38-476.png!

When {{RexNodeToExpressionConverter#visitLiteral}} encounters a 
{{TIMESTAMP_WITH_LOCAL_TIME_ZONE}} literal, it uses the configured timezone, 
"Asia/Shanghai", to convert the {{TimestampString}} value to an {{Instant}}. 
However, the upstream {{TimestampString}} value has already been normalized to 
UTC. Interpreting it again in the local timezone shifts the resulting 
{{Instant}} by the zone offset, so the pushed-down filter compares against the 
wrong point in time.
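
The mismatch described above can be reproduced with plain {{java.time}}, 
without any Flink or Calcite classes. This is only a sketch: the class name 
{{TimestampLtzLiteralDemo}} and the helper {{toInstant}} are hypothetical, and 
it assumes, as described above, that the literal reaching the converter is a 
zone-less wall-clock value already expressed in UTC.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

public class TimestampLtzLiteralDemo {

    // Interprets a zone-less wall-clock value as local time in the given zone.
    static Instant toInstant(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toInstant();
    }

    public static void main(String[] args) {
        ZoneId shanghai = ZoneId.of("Asia/Shanghai");

        // The value stored in the table: 2024-04-30 07:00:00 in Asia/Shanghai.
        Instant stored = toInstant(LocalDateTime.of(2024, 4, 30, 7, 0, 0), shanghai);

        // Assumption: upstream has already normalized the literal to UTC,
        // so the converter sees the wall-clock value 2024-04-29T23:00.
        LocalDateTime utcLiteral = LocalDateTime.ofInstant(stored, ZoneOffset.UTC);

        // Reading the literal as UTC recovers the stored instant.
        Instant readAsUtc = toInstant(utcLiteral, ZoneOffset.UTC);

        // Reading the same literal as Asia/Shanghai applies the offset twice.
        Instant readAsLocal = toInstant(utcLiteral, shanghai);

        System.out.println("stored      = " + stored);
        System.out.println("readAsUtc   = " + readAsUtc);
        System.out.println("readAsLocal = " + readAsLocal);
        System.out.println("error       = " + Duration.between(readAsLocal, stored));
    }
}
```

In this sketch the value read as UTC equals the stored instant, while the value 
read as "Asia/Shanghai" is 8 hours earlier, so an equality filter built from it 
would not match the stored rows.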

Is the handling of {{TIMESTAMP_WITH_LOCAL_TIME_ZONE}} literals in 
{{RexNodeToExpressionConverter#visitLiteral}} a bug, and should it interpret 
the data in the UTC timezone instead?

Please help confirm whether this is the cause. If so, we can submit a patch to 
fix it.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
