Now we have 2 discussion threads on 3 mailing lists. Which one should have prioity? Should I repost my large email here again?

I think it is good to inform and invite in the user mailing lists but let's keep the FLIP discussion on the dev@ ML only.

Regards,
Timo

On 21.01.21 16:50, Leonard Xu wrote:
Hello, everyone

I want to start the discussion of FLIP-162: Consistent Flink SQL time function behavior[1]. We’ve some initial discussion of several problematic functions in dev mail list[2], and I think it's the right time to resolve them by a FLIP. Currently some time function behaviors are wired to user, user can not get local date/time/timestamp in their local time zone for time functions:

  * CURRENT_DATE
  * CURRENT_TIME
  * CURRENT_TIMESTAMP
  * NOW()
  * PROCTIME()

Assume user's clock time is '*2021-01-20 07:52:52.270'* in Beijing time(UTC+8), currently the unexpected values are returned when user SELECT above functions in Flink SQL client

*Flink SQL> **SELECT NOW(), PROCTIME(), CURRENT_TIMESTAMP, CURRENT_DATE, CURRENT_TIME;*
*+-------------------------+-------------------------+-------------------------+--------------+--------------+*
*|**NOW() ** |              PROCTIME() | CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |*
*+-------------------------+-------------------------+-------------------------+--------------+--------------+*
*| 2021-01-19T23:52:52.270 | **2021-01-19T23:52:52.270** | **2021-01-19T23:52:52.270** | 2021-01-19 | 23**:52:52.270** |*
*+-------------------------+-------------------------+-------------------------+--------------+--------------+*

Besides, the window with interval one day width based on PROCTIME() can not collect correct data that belongs to the date '2021-01-20', because some data was assigned to window '2021-01-19' due to the PROCTIME() does not return local TIMESTAMP as user expected.

These problems come from these time-related functions like PROCTIME(), NOW(), CURRENT_DATE, CURRENT_TIME and CURRENT_TIMESTAMP are returning time values based on UTC+0 time zone, this is an incorrect behavior from my investigation[3]. I Invested all Flink time-related functions and compared with other DB vendors like Pg,Presto, Hive, Spark, Snowflake, this topic will lead to a comparison of the three types, i.e.

  *   TIMESTAMP/TIMESTAMP WITHOUT TIME ZONE
  *   TIMESTAMP WITH LOCAL TIME ZONE
  *   TIMESTAMP WITH TIME ZONE

In order to better understand above three types, I wrote a document[4] to help understand them better. You will found the behavior of them is same with in Hadoop ecosystem from the document.*The document is detailed and pretty long, it’s necessary to make the semantics clear(You can focus on the FLIP and skip the document).*

In one word, to correct the behavior of above functions, we can change the function return type or function return value. Both of them are valid because SQL:2011 does not specify the function return type, and every SQL engine vendor has its own implementation. For example the CURRENT_TIMESTAMP function in the document[3], *Spark, Presto, Snowflake *have different behaviors.

*I tend to only change the return value for these problematic functions and introduce an option for compatibility consideration,* the detailed proposal can be found in FLIP-162[1]. After corrected these function, user can get their expected return values as following:

*Flink SQL> **SELECT NOW(), PROCTIME(), CURRENT_TIMESTAMP, CURRENT_DATE, CURRENT_TIME;*
*+-------------------------+-------------------------+-------------------------+--------------+--------------+*
*|**NOW() ** |              PROCTIME() | CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |*
*+-------------------------+-------------------------+-------------------------+--------------+--------------+*
*| 2021-01-20T07:52:52.270 | **2021-01-20T07:52:52.270** | **2021-01-20T07:52:52.270** | 2021-01-20 | **07:52:52.270** |*
*+-------------------------+-------------------------+-------------------------+--------------+--------------+*

Looking forward to your feedback.

Best,
Leonard

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior <https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Correct-time-related-function-behavior-in-Flink-SQL-tc47989.html <http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Correct-time-related-function-behavior-in-Flink-SQL-tc47989.html> [3] https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing <https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing> [4] https://docs.google.com/document/d/1iY3eatV8LBjmF0gWh2JYrQR0FlTadsSeuCsksOVp_iA/edit?usp=sharing <https://docs.google.com/document/d/1iY3eatV8LBjmF0gWh2JYrQR0FlTadsSeuCsksOVp_iA/edit?usp=sharing>


Reply via email to