Thanks all for sharing your opinions. Looks like we’ve reached a consensus about the topic.
@Timo: > 1) Are we on the same page that LOCALTIMESTAMP returns TIMESTAMP and not > TIMESTAMP_LTZ? Maybe we should quickly list also LOCALTIME/LOCALDATE and > LOCALTIMESTAMP for completeness. Yes, LOCALTIMESTAMP returns TIMESTAMP, LOCALTIME returns TIME, the behavior of them is clear so I just listed them in the excel[1] of this FLIP references. > 2) Shall we add aliases for the timestamp types as part of this FLIP? I see > Snowflake supports TIMESTAMP_LTZ , TIMESTAMP_NTZ , TIMESTAMP_TZ [1]. I think > the discussion was quite cumbersome with the full string of `TIMESTAMP WITH > LOCAL TIME ZONE`. With this FLIP we are making this type even more prominent. > And important concepts should have a short name because they are used > frequently. According to the FLIP, we are introducing the abbriviation > already in function names like `TO_TIMESTAMP_LTZ`. `TIMESTAMP_LTZ` could be > treated similar to `STRING` for `VARCHAR(MAX_INT)`, the serializable string > representation would not change. @Timo @Jark Nice idea, I also suffered from the long name during the discussions, the abbreviation will not only help us, but also makes it more convenient for users. I list the abbreviation name mapping to support: TIMESTAMP WITHOUT TIME ZONE <=> TIMESTAMP_NTZ (which synonyms TIMESTAMP) TIMESTAMP WITH LOCAL TIME ZONE <=> TIMESTAMP_LTZ TIMESTAMP WITH TIME ZONE <=> TIMESTAMP_TZ (supports them in the future) > 3) I'm fine with supporting all conversion classes like > java.time.LocalDateTime, java.sql.Timestamp that TimestampType supported for > LocalZonedTimestampType. But we agree that Instant stays the default > conversion class right? The default extraction defined in [2] will not > change, correct? Yes, Instant stays the default conversion class. The default > 4) I would remove the comment "Flink supports TIME-related types with > precision well", because unfortunately this is still not correct. We still > have issues with TIME(9), it would be great if someone can finally fix that > though. Maybe the implementation of this FLIP would be a good time to fix > this issue. You’re right, TIME(9) is not supported yet, I'll take account of TIME(9) to the scope of this FLIP. I’ve updated this FLIP[2] according your suggestions @Jark @Timo I’ll start the vote soon if there’re no objections. Best, Leonard [1] https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing <https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior <https://cwiki.apache.org/confluence/display/FLINK/FLIP-162:+Consistent+Flink+SQL+time+function+behavior> > > On 28.01.21 03:18, Jark Wu wrote: >> Thanks Leonard for the further investigation. >> I think we all agree we should correct the return value of >> CURRENT_TIMESTAMP. >> Regarding the return type of CURRENT_TIMESTAMP, I also agree TIMESTAMP_LTZ >> would be more worldwide useful. This may need more effort, but if this is >> the right direction, we should do it. >> Regarding the CURRENT_TIME, if CURRENT_TIMESTAMP returns >> TIMESTAMP_LTZ, then I think CURRENT_TIME shouldn't return TIME_TZ. >> Otherwise, CURRENT_TIME will be quite special and strange. >> Thus I think it has to return TIME type. Given that we already have >> CURRENT_DATE which returns >> DATE WITHOUT TIME ZONE, I think it's fine to return TIME WITHOUT TIME ZONE >> for CURRENT_TIME. >> In a word, the updated FLIP looks good to me. I especially like the >> proposed new function TO_TIMESTAMP_LTZ(numeric, [,scale]). >> This will be very convenient to define rowtime on a long value which is a >> very common case and has been complained a lot in mailing list. >> Best, >> Jark >> On Mon, 25 Jan 2021 at 21:12, Kurt Young <ykt...@gmail.com> wrote: >>> Thanks Leonard for the detailed response and also the bad case about option >>> 1, these all >>> make sense to me. >>> >>> Also nice catch about conversion support of LocalZonedTimestampType, I >>> think it actually >>> makes sense to support java.sql.Timestamp as well as >>> java.time.LocalDateTime. It also has >>> a slight benefit that we might have a chance to run the udf which took them >>> as input parameter >>> after we change the return type. >>> >>> Regarding to the return type of CURRENT_TIME, I also think timezone >>> information is not useful. >>> To not expand this FLIP further, I'm lean to keep it as it is. >>> >>> Best, >>> Kurt >>> >>> >>> On Mon, Jan 25, 2021 at 8:50 PM Leonard Xu <xbjt...@gmail.com> wrote: >>> >>>> Hi, All >>>> >>>> Thanks for your comments. I think all of the thread have agreed that: >>>> (1) The return values of CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME() >>>> are wrong. >>>> (2) The LOCALTIME/LOCALTIMESTAMP and CURRENT_TIME/CURRENT_TIMESTAMP >>> should >>>> be different whether from SQL standard’s perspective or mature systems. >>>> (3) The semantics of three TIMESTAMP types in Flink SQL follows the SQL >>>> standard and also keeps the same with other 'good' vendors. >>>> TIMESTAMP => A literal in >>>> ‘yyyy-MM-dd HH:mm:ss’ format to describe a time, does not contain >>> timezone >>>> info, can not represent an absolute time point. >>>> TIMESTAMP WITH LOCAL ZONE => Records the elapsed time from absolute >>>> time point origin, can represent an absolute time point, requires local >>>> time zone when expressed with ‘yyyy-MM-dd HH:mm:ss’ format. >>>> TIMESTAMP WITH TIME ZONE => Consists of time zone info and a >>>> literal in ‘yyyy-MM-dd HH:mm:ss’ format to describe time, can represent >>> an >>>> absolute time point. >>>> >>>> >>>> Currently we've two ways to correct >>>> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME(). >>>> >>>> option (1): As the FLIP proposed, change the return value from UTC >>>> timezone to local timezone. >>>> Pros: (1) The change looks smaller to users and developers (2) >>>> There're many SQL engines adopted this way >>>> Cons: (1) connector devs may confuse the underlying value of >>>> TimestampData which needs to change according to data type (2) I thought >>>> about this weekend. Unfortunately I found a bad case: >>>> >>>> The proposal is fine if we only use it in FLINK SQL world, but we need to >>>> consider the conversion between Table/DataStream, assume a record >>> produced >>>> in UTC+0 timezone with TIMESTAMP '1970-01-01 08:00:44' and the Flink SQL >>>> processes the data with session time zone 'UTC+8', if the sql program >>> need >>>> to convert the Table to DataStream, then we need to calculate the >>> timestamp >>>> in StreamRecord with session time zone (UTC+8), then we will get 44 in >>>> DataStream program, but it is wrong because the expected value should be >>> (8 >>>> * 60 * 60 + 44). The corner case tell us that the ROWTIME/PROCTIME in >>> Flink >>>> are based on UTC+0, when correct the PROCTIME() function, the better way >>> is >>>> to use TIMESTAMP WITH LOCAL TIME ZONE which keeps same long value with >>> time >>>> based on UTC+0 and can be expressed with local timezone. >>>> >>>> option (2) : As we considered in the FLIP as well as @Timo suggested, >>>> change the return type to TIMESTAMP WITH LOCAL TIME ZONE, the expressed >>>> value depends on the local time zone. >>>> Pros: (1) Make Flink SQL more close to SQL standard (2) Can deal >>>> the conversion between Table/DataStream well >>>> Cons: (1) We need to discuss the return value/type of >>> CURRENT_TIME >>>> function (2) The change is bigger to users, we need to support TIMESTAMP >>>> WITH LOCAL TIME ZONE in connectors/formats as well as custom connectors. >>>> (3)The TIMESTAMP WITH LOCAL TIME ZONE support is weak >>>> in Flink, thus we need some improvement,but the workload does not matter >>>> as long as we are doing the right thing ^_^ >>>> >>>> Due to the above bad case for option (1). I think option 2 should be >>>> adopted, >>>> But we also need to consider some problems: >>>> (1) More conversion classes like LocalDateTime, sql.Timestamp should be >>>> supported for LocalZonedTimestampType to resolve the UDF compatibility >>> issue >>>> (2) The timezone offset for window size of one day should still be >>>> considered >>>> (3) All connectors/formats should supports TIMESTAMP WITH LOCAL TIME ZONE >>>> well and we also should record in document >>>> I’ll update these sections of FLIP-162. >>>> >>>> >>>> >>>> We also need to discuss the CURRENT_TIME function. I know the standard >>> way >>>> is using TIME WITH TIME ZONE(there's no TIME WITH LOCAL TIME ZONE), but >>> we >>>> don't support this type yet and I don't see strong motivation to support >>> it >>>> so far. >>>> Compared to CURRENT_TIMESTAMP, the CURRENT_TIME can not represent an >>>> absolute time point which should be considered as a string consisting of >>> a >>>> time with 'HH:mm:ss' format and time zone info. We have several options >>>> for this: >>>> (1) We can forbid CURRENT_TIME as @Timo proposed to make all Flink SQL >>>> functions follow the standard well, in this way, we need to offer some >>>> guidance for user upgrading Flink versions. >>>> (2) We can also support it from a user's perspective who has used >>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP, btw,Snowflake also returns >>>> TIME type. >>>> (3) Returns TIMESTAMP WITH LOCAL TIME ZONE to make it equal to >>>> CURRENT_TIMESTAMP as Calcite did. >>>> >>>> I can image (1) which we don't want to left a bad smell in Flink SQL, >>> and >>>> I also accept (2) because I think users do not consider time zone issues >>>> when they use CURRENT_DATE/CURRENT_TIME, and the timezone info in time is >>>> not very useful. >>>> >>>> I don’t have a strong opinion for them. What do others think? >>>> >>>> >>>> I hope I've addressed your concerns. @Timo @Kurt >>>> >>>> Best, >>>> Leonard >>>> >>>> >>>> >>>>> Most of the mature systems have a clear difference between >>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. I wouldn't take Spark or Hive as a >>>> good example. Snowflake decided for TIMESTAMP WITH LOCAL TIME ZONE. As I >>>> mentioned in the last comment, I could also imagine this behavior for >>>> Flink. But in any case, there should be some time zone information >>>> considered in order to cast to all other types. >>>>> >>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL >>>> standard, but >>>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping >>>> functions which >>>>>>>> SQL standard supported and introducing a replacement which SQL >>>> standard not >>>>>>>> reminded. >>>>> >>>>> We can still add those functions in the future. But since we don't >>> offer >>>> a TIME WITH TIME ZONE, it is better to not support this function at all >>> for >>>> now. And by the way, this is exactly the behavior that also Microsoft SQL >>>> Server does: it also just supports CURRENT_TIMESTAMP (but it returns >>>> TIMESTAMP without a zone which completes the confusion). >>>>> >>>>>>>> I also agree returning TIMESTAMP WITH LOCAL TIME ZONE for PROCTIME >>>> has >>>>>>>> more clear semantics, but I realized that user didn’t care the type >>>> but >>>>>>>> more about the expressed value they saw, and change the type from >>>> TIMESTAMP >>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that we need >>>>>>>> consider all places where the TIMESTAMP type used >>>>> >>>>> From a UDF perspective, I think nothing will change. The new type >>> system >>>> and type inference were designed to support all these cases. There is a >>>> reason why Java has adopted Joda time, because it is hard to come up >>> with a >>>> good time library. That's why also we and the other Hadoop ecosystem >>> folks >>>> have decided for 3 different kinds of LocalDateTime, ZonedDateTime, and >>>> Instance. It makes the library more complex, but time is a complex topic. >>>>> >>>>> I also doubt that many users work with only one time zone. Take the US >>>> as an example, a country with 3 different timezones. Somebody working >>> with >>>> US data cannot properly see the data points with just LOCAL TIME ZONE. >>> But >>>> on the other hand, a lot of event data is stored using a UTC timestamp. >>>>> >>>>> >>>>>>> Before jumping into technique details, let's take a step back to >>>> discuss >>>>>>> user experience. >>>>>>> >>>>>>> The first important question is what kind of date and time will >>> Flink >>>>>>> display when users call >>>>>>> CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are >>>> similar). >>>>>>> >>>>>>> Should it always display the date and time in UTC or in the user's >>>> time >>>>>>> zone? >>>>> >>>>> @Kurt: I think we all agree that the current behavior with just showing >>>> UTC is wrong. Also, we all agree that when calling CURRENT_TIMESTAMP or >>>> PROCTIME a user would like to see the time in it's current time zone. >>>>> >>>>> As you said, "my wall clock time". >>>>> >>>>> However, the question is what is the data type of what you "see". If >>> you >>>> pass this record on to a different system, operator, or different >>> cluster, >>>> should the "my" get lost or materialized into the record? >>>>> >>>>> TIMESTAMP -> completely lost and could cause confusion in a different >>>> system >>>>> >>>>> TIMESTAMP WITH LOCAL TIME ZONE -> at least the UTC is correct, so you >>>> can provide a new local time zone >>>>> >>>>> TIMESTAMP WITH TIME ZONE -> also "your" location is persisted >>>>> >>>>> Regards, >>>>> Timo >>>>> >>>>> >>>>> >>>>> >>>>> On 22.01.21 09:38, Kurt Young wrote: >>>>>> Forgot one more thing. Continue with displaying in UTC. As a user, if >>>> Flink >>>>>> want to display the timestamp >>>>>> in UTC, why don't we offer something like UTC_TIMESTAMP? >>>>>> Best, >>>>>> Kurt >>>>>> On Fri, Jan 22, 2021 at 4:33 PM Kurt Young <ykt...@gmail.com> wrote: >>>>>>> Before jumping into technique details, let's take a step back to >>>> discuss >>>>>>> user experience. >>>>>>> >>>>>>> The first important question is what kind of date and time will Flink >>>>>>> display when users call >>>>>>> CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are >>>> similar). >>>>>>> >>>>>>> Should it always display the date and time in UTC or in the user's >>> time >>>>>>> zone? I think this part is the >>>>>>> reason that surprised lots of users. If we forget about the type and >>>>>>> internal representation of these >>>>>>> two methods, as a user, my instinct tells me that these two methods >>>> should >>>>>>> display my wall clock time. >>>>>>> >>>>>>> Display time in UTC? I'm not sure, why I should care about UTC time? >>> I >>>>>>> want to get my current timestamp. >>>>>>> For those users who have never gone abroad, they might not even be >>>> able to >>>>>>> realize that this is affected >>>>>>> by the time zone. >>>>>>> >>>>>>> Best, >>>>>>> Kurt >>>>>>> >>>>>>> >>>>>>> On Fri, Jan 22, 2021 at 12:25 PM Leonard Xu <xbjt...@gmail.com> >>> wrote: >>>>>>> >>>>>>>> Thanks @Timo for the detailed reply, let's go on this topic on this >>>>>>>> discussion, I've merged all mails to this discussion. >>>>>>>> >>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP >>>>>>>>> >>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP >>>>>>>> >>>>>>>>> >>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP >>>>>>>>> >>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP >>>>>>>>> >>>>>>>>> I'm very sceptical about this behavior. Almost all mature systems >>>>>>>> (Oracle, Postgres) and new high quality systems (Presto, Snowflake) >>>> use a >>>>>>>> data type with some degree of time zone information encoded. In a >>>>>>>> globalized world with businesses spanning different regions, I think >>>> we >>>>>>>> should do this as well. There should be a difference between >>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to >>>> choose >>>>>>>> which behavior they prefer for their pipeline. >>>>>>>> >>>>>>>> >>>>>>>> I know that the two series should be different at first glance, but >>>>>>>> different SQL engines can have their own explanations,for example, >>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP are synonyms in Snowflake[1] >>> and >>>> has >>>>>>>> no difference, and Spark only supports the later one and doesn’t >>>> support >>>>>>>> LOCALTIME/LOCALTIMESTAMP[2]. >>>>>>>> >>>>>>>> >>>>>>>>> If we would design this from scatch, I would suggest the following: >>>>>>>>> >>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE / >>>>>>>> LOCALTIME for materialized timestamp parts >>>>>>>> >>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL >>> standard, >>>> but >>>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping >>> functions >>>> which >>>>>>>> SQL standard supported and introducing a replacement which SQL >>>> standard not >>>>>>>> reminded. >>>>>>>> >>>>>>>> >>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to >>>>>>>> materialize all session time information into every record. It it >>> the >>>> most >>>>>>>> generic data type and allows to cast to all other timestamp data >>>> types. >>>>>>>> This generic ability can be used for filter predicates as well >>> either >>>>>>>> through implicit or explicit casting. >>>>>>>> >>>>>>>> TIMESTAMP WITH TIME ZONE indeed contains more information to >>> describe >>>> a >>>>>>>> time point, but the type TIMESTAMP can cast to all other timestamp >>>> data >>>>>>>> types combining with session time zone as well, and it also can be >>>> used for >>>>>>>> filter predicates. For type casting between BIGINT and TIMESTAMP, I >>>> think >>>>>>>> the function way using TO_TIMEMTAMP()/FROM_UNIXTIMESTAMP() is more >>>> clear. >>>>>>>> >>>>>>>>> PROCTIME/ROWTIME should be time functions based on a long value. >>> Both >>>>>>>> System.currentMillis() and our watermark system work on long values. >>>> Those >>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the main >>>> calculation >>>>>>>> should always happen based on UTC. >>>>>>>>> We discussed it in a different thread, but we should allow PROCTIME >>>>>>>> globally. People need a way to create instances of TIMESTAMP WITH >>>> LOCAL >>>>>>>> TIME ZONE. This is not considered in the current design doc. >>>>>>>>> Many pipelines contain UTC timestamps and thus it should be easy to >>>>>>>> create one. >>>>>>>>> Also, both CURRENT_TIMESTAMP and LOCALTIMESTAMP can work with this >>>> type >>>>>>>> because we should remember that TIMESTAMP WITH LOCAL TIME ZONE >>>> accepts all >>>>>>>> timestamp data types as casting target [1]. We could allow TIMESTAMP >>>> WITH >>>>>>>> TIME ZONE in the future for ROWTIME. >>>>>>>>> In any case, windows should simply adapt their behavior to the >>> passed >>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is >>>> defined by >>>>>>>> considering the current session time zone. >>>>>>>> >>>>>>>> I also agree returning TIMESTAMP WITH LOCAL TIME ZONE for PROCTIME >>>> has >>>>>>>> more clear semantics, but I realized that user didn’t care the type >>>> but >>>>>>>> more about the expressed value they saw, and change the type from >>>> TIMESTAMP >>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that we need >>>>>>>> consider all places where the TIMESTAMP type used, and many builtin >>>>>>>> functions and UDFs doest not support TIMESTAMP WITH LOCAL TIME ZONE >>>> type. >>>>>>>> That means both user and Flink devs need to refactor the code(UDF, >>>> builtin >>>>>>>> functions, sql pipeline), to be honest, I didn’t see strong >>>> motivation that >>>>>>>> we have to do the pretty big refactor from user’s perspective and >>>>>>>> developer’s perspective. >>>>>>>> >>>>>>>> In one word, both your suggestion and my proposal can resolve almost >>>> all >>>>>>>> user problems,the divergence is whether we need to spend pretty >>>> energy just >>>>>>>> to get a bit more accurate semantics? I think we need a tradeoff. >>>>>>>> >>>>>>>> >>>>>>>> Best, >>>>>>>> Leonard >>>>>>>> [1] >>>>>>>> >>>> https://trino.io/docs/current/functions/datetime.html#current_timestamp >>> < >>>>>>>> >>>> https://trino.io/docs/current/functions/datetime.html#current_timestamp> >>>>>>>> [2] https://issues.apache.org/jira/browse/SPARK-30374 < >>>>>>>> https://issues.apache.org/jira/browse/SPARK-30374> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> 2021-01-22,00:53,Timo Walther <twal...@apache.org> : >>>>>>>>> >>>>>>>>> Hi Leonard, >>>>>>>>> >>>>>>>>> thanks for working on this topic. I agree that time handling is not >>>>>>>> easy in Flink at the moment. We added new time data types (and some >>>> are >>>>>>>> still not supported which even further complicates things like >>>> TIME(9)). We >>>>>>>> should definitely improve this situation for users. >>>>>>>>> >>>>>>>>> This is a pretty opinionated topic and it seems that the SQL >>> standard >>>>>>>> is not really deciding this but is at least supporting. So let me >>>> express >>>>>>>> my opinion for the most important functions: >>>>>>>>> >>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP >>>>>>>>> >>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP >>>>>>>>> >>>>>>>>> I think those are the most obvious ones because the LOCAL indicates >>>>>>>> that the locality should be materialized into the result and any >>> time >>>> zone >>>>>>>> information (coming from session config or data) is not important >>>>>>>> afterwards. >>>>>>>>> >>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP >>>>>>>>> >>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP >>>>>>>>> >>>>>>>>> I'm very sceptical about this behavior. Almost all mature systems >>>>>>>> (Oracle, Postgres) and new high quality systems (Presto, Snowflake) >>>> use a >>>>>>>> data type with some degree of time zone information encoded. In a >>>>>>>> globalized world with businesses spanning different regions, I think >>>> we >>>>>>>> should do this as well. There should be a difference between >>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to >>>> choose >>>>>>>> which behavior they prefer for their pipeline. >>>>>>>>> >>>>>>>>> If we would design this from scatch, I would suggest the following: >>>>>>>>> >>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE / >>>>>>>> LOCALTIME for materialized timestamp parts >>>>>>>>> >>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to >>>>>>>> materialize all session time information into every record. It it >>> the >>>> most >>>>>>>> generic data type and allows to cast to all other timestamp data >>>> types. >>>>>>>> This generic ability can be used for filter predicates as well >>> either >>>>>>>> through implicit or explicit casting. >>>>>>>>> >>>>>>>>> PROCTIME/ROWTIME should be time functions based on a long value. >>> Both >>>>>>>> System.currentMillis() and our watermark system work on long values. >>>> Those >>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the main >>>> calculation >>>>>>>> should always happen based on UTC. We discussed it in a different >>>> thread, >>>>>>>> but we should allow PROCTIME globally. People need a way to create >>>>>>>> instances of TIMESTAMP WITH LOCAL TIME ZONE. This is not considered >>>> in the >>>>>>>> current design doc. Many pipelines contain UTC timestamps and thus >>> it >>>>>>>> should be easy to create one. Also, both CURRENT_TIMESTAMP and >>>>>>>> LOCALTIMESTAMP can work with this type because we should remember >>> that >>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE accepts all timestamp data types as >>>> casting >>>>>>>> target [1]. We could allow TIMESTAMP WITH TIME ZONE in the future >>> for >>>>>>>> ROWTIME. >>>>>>>>> >>>>>>>>> In any case, windows should simply adapt their behavior to the >>> passed >>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is >>>> defined by >>>>>>>> considering the current session time zone. >>>>>>>>> >>>>>>>>> If we would like to design this with less effort required, we could >>>>>>>> think about returning TIMESTAMP WITH LOCAL TIME ZONE also for >>>>>>>> CURRENT_TIMESTAMP. >>>>>>>>> >>>>>>>>> >>>>>>>>> I will try to involve more people into this discussion. >>>>>>>>> >>>>>>>>> Thanks, >>>>>>>>> Timo >>>>>>>>> >>>>>>>>> [1] >>>>>>>> >>>> >>> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3 >>>>>>>> < >>>>>>>> >>>> >>> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3 >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> 2021-01-21,22:32,Leonard Xu <xbjt...@gmail.com> : >>>>>>>>>> Before the changes, as I am writing this reply, the local time >>> here >>>> is >>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8). >>>>>>>>>> And I tried these 5 functions in sql client, and got: >>>>>>>>>> >>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, >>>> CURRENT_DATE, >>>>>>>> CURRENT_TIME; >>>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>>> | EXPR$0 | EXPR$1 | >>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME | >>>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 | >>>>>>>> 2021-01-21T04:03:35.228 | 2021-01-21 | 04:03:35.228 | >>>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>>> After the changes, the expected behavior will change to: >>>>>>>>>> >>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, >>>> CURRENT_DATE, >>>>>>>> CURRENT_TIME; >>>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>>> | EXPR$0 | EXPR$1 | >>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME | >>>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 | >>>>>>>> 2021-01-21T12:03:35.228 | 2021-01-21 | 12:03:35.228 | >>>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP still >>> be >>>>>>>> TIMESTAMP; >>>>>>>>> >>>>>>>>> To Kurt, thanks for the intuitive case, it really clear, you’re >>>> wright >>>>>>>> that I want to propose to change the return value of these >>> functions. >>>> It’s >>>>>>>> the most important part of the topic from user's perspective. >>>>>>>>> >>>>>>>>>> I think this definitely deserves a FLIP. >>>>>>>>> To Jark, nice suggestion, I prepared a FLIP for this topic, and >>> will >>>>>>>> start the FLIP discussion soon. >>>>>>>>> >>>>>>>>>>> If use the default Flink SQL, the window time range of the >>>>>>>>>>> statistics is incorrect, then the statistical results will >>>> naturally >>>>>>>> be >>>>>>>>>>> incorrect. >>>>>>>>> To zhisheng, sorry to hear that this problem influenced your >>>> production >>>>>>>> jobs, Could you share your SQL pattern? we can have more inputs >>> and >>>> try >>>>>>>> to resolve them. >>>>>>>>> >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Leonard >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> 2021-01-21,14:19,Jark Wu <imj...@gmail.com> : >>>>>>>>> >>>>>>>>> Great examples to understand the problem and the proposed changes, >>>>>>>> @Kurt! >>>>>>>>> >>>>>>>>> Thanks Leonard for investigating this problem. >>>>>>>>> The time-zone problems around time functions and windows have >>>> bothered a >>>>>>>>> lot of users. It's time to fix them! >>>>>>>>> >>>>>>>>> The return value changes sound reasonable to me, and keeping the >>>> return >>>>>>>>> type unchanged will minimize the surprise to the users. >>>>>>>>> Besides that, I think it would be better to mention how this >>> affects >>>> the >>>>>>>>> window behaviors, and the interoperability with DataStream. >>>>>>>>> >>>>>>>>> I think this definitely deserves a FLIP. >>>>>>>>> >>>>>>>>> ==================================================== >>>>>>>>> >>>>>>>>> Hi zhisheng, >>>>>>>>> >>>>>>>>> Do you have examples to illustrate which case will get the wrong >>>> window >>>>>>>>> boundaries? >>>>>>>>> That will help to verify whether the proposed changes can solve >>> your >>>>>>>>> problem. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Jark >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> 2021-01-21,12:54,zhisheng <173855...@qq.com> : >>>>>>>>> >>>>>>>>> Thanks to Leonard Xu for discussing this tricky topic. At present, >>>>>>>> there are many Flink jobs in our production environment that are >>> used >>>> to >>>>>>>> count day-level reports (eg: count PV/UV ). >>>>>>>>> >>>>>>>>> If use the default Flink SQL, the window time range of the >>>>>>>> statistics is incorrect, then the statistical results will naturally >>>> be >>>>>>>> incorrect. >>>>>>>>> >>>>>>>>> The user needs to deal with the time zone manually in order to >>> solve >>>>>>>> the problem. >>>>>>>>> >>>>>>>>> If Flink itself can solve these time zone issues, then I think it >>>> will >>>>>>>> be user-friendly. >>>>>>>>> >>>>>>>>> Thank you >>>>>>>>> >>>>>>>>> Best!; >>>>>>>>> zhisheng >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> 2021-01-21,12:11,Kurt Young <ykt...@gmail.com> : >>>>>>>>> >>>>>>>>> cc this to user & user-zh mailing list because this will affect >>> lots >>>> of >>>>>>>> users, and also quite a lot of users >>>>>>>>> were asking questions around this topic. >>>>>>>>> >>>>>>>>> Let me try to understand this from user's perspective. >>>>>>>>> >>>>>>>>> Your proposal will affect five functions, which are: >>>>>>>>> PROCTIME() >>>>>>>>> NOW() >>>>>>>>> CURRENT_DATE >>>>>>>>> CURRENT_TIME >>>>>>>>> CURRENT_TIMESTAMP >>>>>>>>> Before the changes, as I am writing this reply, the local time here >>>> is >>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8). >>>>>>>>> And I tried these 5 functions in sql client, and got: >>>>>>>>> >>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, >>> CURRENT_DATE, >>>>>>>> CURRENT_TIME; >>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>> | EXPR$0 | EXPR$1 | >>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME | >>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 | >>>>>>>> 2021-01-21T04:03:35.228 | 2021-01-21 | 04:03:35.228 | >>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>> After the changes, the expected behavior will change to: >>>>>>>>> >>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, >>> CURRENT_DATE, >>>>>>>> CURRENT_TIME; >>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>> | EXPR$0 | EXPR$1 | >>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME | >>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 | >>>>>>>> 2021-01-21T12:03:35.228 | 2021-01-21 | 12:03:35.228 | >>>>>>>>> >>>>>>>> >>>> >>> +-------------------------+-------------------------+-------------------------+--------------+--------------+ >>>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP still be >>>>>>>> TIMESTAMP; >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Kurt >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>> >>>> >>>> >>> >