[ https://issues.apache.org/jira/browse/FLINK-30959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685777#comment-17685777 ]
Godfrey He commented on FLINK-30959:
------------------------------------

Thanks for reporting this issue, [~yunfengzhou].

Currently, the behavior for data that carries a time zone is not defined. The description of {{UNIX_TIMESTAMP(string1[, string2])}} in the Flink documentation is {{Converts date time string string1 in format string2 (by default: yyyy-MM-dd HH:mm:ss if not specified) to Unix timestamp (in seconds), using the specified timezone in table config.}}, which means we should always use the time zone specified in the table config to parse the data.

I think the behavior for {{yyyy-MM-dd HH:mm:ss.SSS X}} was not considered before, and I tend to use the time zone in the record if both the format and the record have a time zone. In that case, the result is correct. We need more discussion to determine the behavior.

cc [~Leonard] [~jark] [~twalthr]

> UNIX_TIMESTAMP's return value does not meet expected
> ----------------------------------------------------
>
>                 Key: FLINK-30959
>                 URL: https://issues.apache.org/jira/browse/FLINK-30959
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>    Affects Versions: 1.15.2
>            Reporter: Yunfeng Zhou
>            Priority: Major
>
> When running the following pyflink program
>
> {code:python}
> import pandas as pd
> from pyflink.datastream import StreamExecutionEnvironment, HashMapStateBackend
> from pyflink.table import StreamTableEnvironment
>
> if __name__ == "__main__":
>     input_data = pd.DataFrame(
>         [
>             ["Alex", 100.0, "2022-01-01 08:00:00.001 +0800"],
>             ["Emma", 400.0, "2022-01-01 00:00:00.003 +0000"],
>             ["Alex", 200.0, "2022-01-01 08:00:00.005 +0800"],
>             ["Emma", 300.0, "2022-01-01 00:00:00.007 +0000"],
>             ["Jack", 500.0, "2022-01-01 08:00:00.009 +0800"],
>             ["Alex", 450.0, "2022-01-01 00:00:00.011 +0000"],
>         ],
>         columns=["name", "avg_cost", "time"],
>     )
>     env = StreamExecutionEnvironment.get_execution_environment()
>     env.set_state_backend(HashMapStateBackend())
>     t_env = StreamTableEnvironment.create(env)
>     input_table = t_env.from_pandas(input_data)
>     t_env.create_temporary_view("input_table", input_table)
>     time_format = "yyyy-MM-dd HH:mm:ss.SSS X"
>     output_table = t_env.sql_query(
>         f"SELECT *, UNIX_TIMESTAMP(`time`, '{time_format}') AS unix_time FROM input_table"
>     )
>     output_table.execute().print()
> {code}
> The actual output is
> {code}
> +----+--------------------------------+--------------------------------+--------------------------------+----------------------+
> | op |                           name |                       avg_cost |                           time |            unix_time |
> +----+--------------------------------+--------------------------------+--------------------------------+----------------------+
> | +I |                           Alex |                          100.0 |  2022-01-01 08:00:00.001 +0800 |           1640995200 |
> | +I |                           Emma |                          400.0 |  2022-01-01 00:00:00.003 +0000 |           1640995200 |
> | +I |                           Alex |                          200.0 |  2022-01-01 08:00:00.005 +0800 |           1640995200 |
> | +I |                           Emma |                          300.0 |  2022-01-01 00:00:00.007 +0000 |           1640995200 |
> | +I |                           Jack |                          500.0 |  2022-01-01 08:00:00.009 +0800 |           1640995200 |
> | +I |                           Alex |                          450.0 |  2022-01-01 00:00:00.011 +0000 |           1640995200 |
> +----+--------------------------------+--------------------------------+--------------------------------+----------------------+
> {code}
> While the expected result is
> {code:java}
> +----+--------------------------------+--------------------------------+--------------------------------+----------------------+
> | op |                           name |                       avg_cost |                           time |            unix_time |
> +----+--------------------------------+--------------------------------+--------------------------------+----------------------+
> | +I |                           Alex |                          100.0 |  2022-01-01 08:00:00.001 +0800 |           1640995200 |
> | +I |                           Emma |                          400.0 |  2022-01-01 00:00:00.003 +0000 |           1640966400 |
> | +I |                           Alex |                          200.0 |  2022-01-01 08:00:00.005 +0800 |           1640995200 |
> | +I |                           Emma |                          300.0 |  2022-01-01 00:00:00.007 +0000 |           1640966400 |
> | +I |                           Jack |                          500.0 |  2022-01-01 08:00:00.009 +0800 |           1640995200 |
> | +I |                           Alex |                          450.0 |  2022-01-01 00:00:00.011 +0000 |           1640966400 |
> +----+--------------------------------+--------------------------------+--------------------------------+----------------------+
> {code}
>

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
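
The two readings discussed in the comment (use the offset written in the record, or ignore it and apply the time zone from the table config) can be compared with plain epoch arithmetic. The following is a minimal sketch in standard-library Python, not Flink code: the helper names are hypothetical, the {{strptime}} pattern only stands in for {{yyyy-MM-dd HH:mm:ss.SSS X}}, and UTC+8 is an assumed session time zone, since the report does not say which zone was configured.

```python
from datetime import datetime, timedelta, timezone

# Python strptime directives standing in for the report's pattern
# "yyyy-MM-dd HH:mm:ss.SSS X" (an approximation, not Flink's parser).
PY_FORMAT = "%Y-%m-%d %H:%M:%S.%f %z"


def epoch_using_record_offset(text: str) -> int:
    """Hypothetical helper: honor the UTC offset written in the record."""
    return int(datetime.strptime(text, PY_FORMAT).timestamp())


def epoch_using_session_zone(text: str, session_zone: timezone) -> int:
    """Hypothetical helper: discard the record's offset and read the
    wall-clock time in session_zone instead."""
    wall_clock = datetime.strptime(text, PY_FORMAT).replace(tzinfo=session_zone)
    return int(wall_clock.timestamp())


if __name__ == "__main__":
    utc_plus_8 = timezone(timedelta(hours=8))  # assumed session time zone
    samples = ("2022-01-01 08:00:00.001 +0800", "2022-01-01 00:00:00.003 +0000")
    for text in samples:
        print(
            text,
            epoch_using_record_offset(text),
            epoch_using_session_zone(text, utc_plus_8),
        )
```

Under the record-offset reading both sample strings denote the same instant, 1640995200. Under the assumed UTC+8 session-zone reading the +0000 string maps to 1640966400, which is 28800 seconds (8 hours) earlier. Both values appear in the tables of the report.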