+1 to have the shortcut types TIMESTAMP_LTZ and TIMESTAMP_TZ.

Best,
Jark
On Thu, 28 Jan 2021 at 17:32, Timo Walther <twal...@apache.org> wrote:

Hi Leonard,

thanks for the great summary and the updated FLIP. I think using TIMESTAMP_LTZ for CURRENT_TIMESTAMP/PROCTIME/ROWTIME is a good long-term solution. I also discussed this with people of different backgrounds internally and everybody seems to agree with the proposed design. I hope we can have a stable implementation in 1.13, because a lot of locations will be touched by this change: time attributes, watermark generators, connectors, formats, converters, functions, windows.

The FLIP is in very good shape. I think we can start a vote soon if there are no objections. I have some last comments:

1) Are we on the same page that LOCALTIMESTAMP returns TIMESTAMP and not TIMESTAMP_LTZ? Maybe we should also quickly list LOCALTIME/LOCALDATE and LOCALTIMESTAMP for completeness.

2) Shall we add aliases for the timestamp types as part of this FLIP? I see Snowflake supports TIMESTAMP_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ [1]. The discussion was quite cumbersome with the full string `TIMESTAMP WITH LOCAL TIME ZONE`. With this FLIP we are making this type even more prominent, and important concepts should have a short name because they are used frequently. According to the FLIP, we are already introducing the abbreviation in function names like `TO_TIMESTAMP_LTZ`. `TIMESTAMP_LTZ` could be treated similarly to `STRING` for `VARCHAR(MAX_INT)`; the serializable string representation would not change.

3) I'm fine with supporting all conversion classes for LocalZonedTimestampType that TimestampType supports, like java.time.LocalDateTime and java.sql.Timestamp. But we agree that Instant stays the default conversion class, right? The default extraction defined in [2] will not change, correct?

4) I would remove the comment "Flink supports TIME-related types with precision well", because unfortunately this is still not correct. We still have issues with TIME(9); it would be great if someone could finally fix that. Maybe the implementation of this FLIP would be a good time to fix this issue.

Regards,
Timo

[1] https://docs.snowflake.com/en/sql-reference/data-types-datetime.html#timestamp-ltz-timestamp-ntz-timestamp-tz
[2] https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/utils/ClassDataTypeConverter.java

On 28.01.21 03:18, Jark Wu wrote:

Thanks Leonard for the further investigation.

I think we all agree we should correct the return value of CURRENT_TIMESTAMP. Regarding the return type of CURRENT_TIMESTAMP, I also agree that TIMESTAMP_LTZ would be more useful worldwide. This may need more effort, but if it is the right direction, we should do it.

Regarding CURRENT_TIME: if CURRENT_TIMESTAMP returns TIMESTAMP_LTZ, then I think CURRENT_TIME shouldn't return TIME_TZ. Otherwise, CURRENT_TIME would be quite special and strange, so I think it has to return the TIME type. Given that we already have CURRENT_DATE, which returns DATE WITHOUT TIME ZONE, I think it's fine to return TIME WITHOUT TIME ZONE for CURRENT_TIME.

In a word, the updated FLIP looks good to me. I especially like the proposed new function TO_TIMESTAMP_LTZ(numeric [, scale]). It will be very convenient for defining a rowtime on a long value, which is a very common case and has been a frequent complaint on the mailing list.

Best,
Jark
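A minimal sketch of that rowtime use case, assuming the proposed TO_TIMESTAMP_LTZ(numeric [, scale]) becomes available as described; the table, its columns and the datagen connector are made up purely for illustration:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class ToTimestampLtzSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // epoch_millis is a plain BIGINT; the computed column interprets it as an
        // absolute instant (TIMESTAMP WITH LOCAL TIME ZONE) so it can back a
        // rowtime attribute and a watermark.
        tEnv.executeSql(
                "CREATE TABLE events ("
                        + "  id           STRING,"
                        + "  epoch_millis BIGINT,"
                        + "  ts AS TO_TIMESTAMP_LTZ(epoch_millis, 3),"
                        + "  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND"
                        + ") WITH ('connector' = 'datagen')");
    }
}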
On Mon, 25 Jan 2021 at 21:12, Kurt Young <ykt...@gmail.com> wrote:

Thanks Leonard for the detailed response and also the bad case about option 1; these all make sense to me.

Also nice catch about the conversion support of LocalZonedTimestampType. I think it actually makes sense to support java.sql.Timestamp as well as java.time.LocalDateTime. It also has the slight benefit that we might have a chance to still run UDFs which take them as input parameters after we change the return type.

Regarding the return type of CURRENT_TIME, I also think the time zone information is not useful. To not expand this FLIP further, I lean towards keeping it as it is.

Best,
Kurt

On Mon, Jan 25, 2021 at 8:50 PM Leonard Xu <xbjt...@gmail.com> wrote:

Hi, All

Thanks for your comments. I think the whole thread has agreed that:
(1) The return values of CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME() are wrong.
(2) LOCALTIME/LOCALTIMESTAMP and CURRENT_TIME/CURRENT_TIMESTAMP should be different, both from the SQL standard's perspective and in mature systems.
(3) The semantics of the three TIMESTAMP types in Flink SQL follow the SQL standard and also match other 'good' vendors:
    TIMESTAMP => a literal in 'yyyy-MM-dd HH:mm:ss' format describing a time; it contains no time zone info and cannot represent an absolute time point.
    TIMESTAMP WITH LOCAL TIME ZONE => records the time elapsed since the absolute epoch origin; it can represent an absolute time point and requires the local time zone when expressed in 'yyyy-MM-dd HH:mm:ss' format.
    TIMESTAMP WITH TIME ZONE => consists of time zone info plus a literal in 'yyyy-MM-dd HH:mm:ss' format describing a time; it can represent an absolute time point.

Currently we have two ways to correct CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME().

option (1): As the FLIP proposed, change the return value from the UTC time zone to the local time zone.
    Pros: (1) The change looks smaller to users and developers. (2) Many SQL engines have adopted this way.
    Cons: (1) Connector developers may be confused about the underlying value of TimestampData, which would need to change depending on the data type. (2) I thought about this over the weekend and unfortunately found a bad case:

The proposal is fine if we only use it in the Flink SQL world, but we need to consider the conversion between Table and DataStream. Assume a record produced in the UTC+0 time zone with TIMESTAMP '1970-01-01 08:00:44', and Flink SQL processing the data with session time zone 'UTC+8'. If the SQL program needs to convert the Table to a DataStream, we would have to calculate the timestamp in the StreamRecord using the session time zone (UTC+8) and would get 44 in the DataStream program, which is wrong because the expected value is (8 * 60 * 60 + 44). This corner case tells us that ROWTIME/PROCTIME in Flink are based on UTC+0; when correcting the PROCTIME() function, the better way is to use TIMESTAMP WITH LOCAL TIME ZONE, which keeps the same long value as the UTC+0-based time and can still be expressed in the local time zone.
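A minimal sketch of the arithmetic behind this corner case, using plain java.time (nothing Flink-specific); the literal and the zones are the ones from the example above:

import java.time.LocalDateTime;
import java.time.ZoneOffset;

public class EpochInterpretationSketch {
    public static void main(String[] args) {
        LocalDateTime literal = LocalDateTime.parse("1970-01-01T08:00:44");

        // Interpreted with the producer's zone (UTC+0): the expected epoch seconds.
        long producedEpoch = literal.toEpochSecond(ZoneOffset.UTC);        // 8 * 60 * 60 + 44 = 28844

        // Interpreted with the session zone (UTC+8): the wrong value from option (1).
        long sessionEpoch = literal.toEpochSecond(ZoneOffset.ofHours(8));  // 44

        System.out.println(producedEpoch + " vs " + sessionEpoch);
    }
}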
option (2): As we considered in the FLIP and as @Timo suggested, change the return type to TIMESTAMP WITH LOCAL TIME ZONE; the expressed value then depends on the local time zone.
    Pros: (1) Makes Flink SQL closer to the SQL standard. (2) Handles the conversion between Table and DataStream well.
    Cons: (1) We need to discuss the return value/type of the CURRENT_TIME function. (2) The change is bigger for users; we need to support TIMESTAMP WITH LOCAL TIME ZONE in connectors/formats as well as custom connectors. (3) The TIMESTAMP WITH LOCAL TIME ZONE support in Flink is weak, so we need some improvements, but the workload does not matter as long as we are doing the right thing ^_^

Due to the above bad case for option (1), I think option (2) should be adopted. But we also need to consider some problems:
(1) More conversion classes like java.time.LocalDateTime and java.sql.Timestamp should be supported for LocalZonedTimestampType to resolve the UDF compatibility issue.
(2) The time zone offset for a window size of one day still needs to be considered.
(3) All connectors/formats should support TIMESTAMP WITH LOCAL TIME ZONE well, and we should also document this.
I'll update these sections of FLIP-162.

We also need to discuss the CURRENT_TIME function. I know the standard way is to use TIME WITH TIME ZONE (there is no TIME WITH LOCAL TIME ZONE), but we don't support this type yet and I don't see strong motivation to support it so far. Compared to CURRENT_TIMESTAMP, CURRENT_TIME cannot represent an absolute time point; it should be considered a string consisting of a time in 'HH:mm:ss' format plus time zone info. We have several options for this:
(1) Forbid CURRENT_TIME, as @Timo proposed, so that all Flink SQL functions follow the standard well; in this case we need to offer some guidance for users upgrading Flink versions.
(2) Keep supporting it, from the perspective of users who have used CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP; by the way, Snowflake also returns the TIME type.
(3) Return TIMESTAMP WITH LOCAL TIME ZONE to make it equal to CURRENT_TIMESTAMP, as Calcite does.

I can imagine (1), since we don't want to leave a bad smell in Flink SQL, and I can also accept (2) because I think users do not consider time zone issues when they use CURRENT_DATE/CURRENT_TIME, and the time zone info in a time value is not very useful.

I don't have a strong opinion on these. What do others think?

I hope I've addressed your concerns. @Timo @Kurt

Best,
Leonard
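A minimal sketch of the UDF compatibility question from problem (1) above (and Timo's point 3 earlier): by default a TIMESTAMP WITH LOCAL TIME ZONE argument reaches a UDF as java.time.Instant, and the proposal is to additionally allow bridging classes such as java.time.LocalDateTime and java.sql.Timestamp; the function name and logic here are made up for illustration.

import java.time.Instant;
import org.apache.flink.table.annotation.DataTypeHint;
import org.apache.flink.table.functions.ScalarFunction;

public class MillisSinceEpoch extends ScalarFunction {
    // Today's default extraction: the TIMESTAMP WITH LOCAL TIME ZONE argument
    // arrives as java.time.Instant. Supporting LocalDateTime / java.sql.Timestamp
    // as additional bridging classes would let UDFs written against those classes
    // keep working after the return-type change.
    public long eval(@DataTypeHint("TIMESTAMP(3) WITH LOCAL TIME ZONE") Instant ts) {
        return ts.toEpochMilli();
    }
}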
Timo Walther <twal...@apache.org> wrote:

Most of the mature systems have a clear difference between CURRENT_TIMESTAMP and LOCALTIMESTAMP. I wouldn't take Spark or Hive as a good example. Snowflake decided for TIMESTAMP WITH LOCAL TIME ZONE. As I mentioned in the last comment, I could also imagine this behavior for Flink. But in any case, there should be some time zone information considered in order to cast to all other types.

> The functions CURRENT_DATE/CURRENT_TIME are supported in the SQL standard, but LOCALDATE is not. I don't think it's a good idea to drop functions which the SQL standard supports and introduce a replacement which the SQL standard does not mention.

We can still add those functions in the future. But since we don't offer a TIME WITH TIME ZONE, it is better not to support this function at all for now. And by the way, this is exactly the behavior of Microsoft SQL Server: it also just supports CURRENT_TIMESTAMP (but it returns TIMESTAMP without a zone, which completes the confusion).

> I also agree that returning TIMESTAMP WITH LOCAL TIME ZONE for PROCTIME has clearer semantics, but I realized that users don't care about the type so much as about the expressed value they see, and changing the type from TIMESTAMP to TIMESTAMP WITH LOCAL TIME ZONE brings a huge refactoring where we need to consider all places where the TIMESTAMP type is used.

From a UDF perspective, I think nothing will change. The new type system and type inference were designed to support all these cases. There is a reason why Java has adopted Joda time: it is hard to come up with a good time library. That's why we and the other Hadoop ecosystem folks have also decided for 3 different kinds: LocalDateTime, ZonedDateTime, and Instant. It makes the library more complex, but time is a complex topic.

I also doubt that many users work with only one time zone. Take the US as an example, a country with 3 different time zones. Somebody working with US data cannot properly see the data points with just LOCAL TIME ZONE. But on the other hand, a lot of event data is stored using a UTC timestamp.

> Before jumping into technical details, let's take a step back to discuss user experience.
>
> The first important question is what kind of date and time Flink will display when users call CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are similar).
>
> Should it always display the date and time in UTC or in the user's time zone?

@Kurt: I think we all agree that the current behavior of just showing UTC is wrong. Also, we all agree that when calling CURRENT_TIMESTAMP or PROCTIME a user would like to see the time in their current time zone.

As you said, "my wall clock time".

However, the question is what is the data type of what you "see". If you pass this record on to a different system, operator, or different cluster, should the "my" get lost or be materialized into the record?

TIMESTAMP -> completely lost and could cause confusion in a different system

TIMESTAMP WITH LOCAL TIME ZONE -> at least the UTC instant is correct, so you can provide a new local time zone

TIMESTAMP WITH TIME ZONE -> also "your" location is persisted

Regards,
Timo
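The three variants roughly correspond to the three java.time classes mentioned above; a small illustrative sketch (values and zones are arbitrary) of what information each one carries:

import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class ThreeTimestampKinds {
    public static void main(String[] args) {
        // TIMESTAMP (no zone): just a wall-clock reading, "my" zone is lost.
        LocalDateTime wallClock = LocalDateTime.of(2021, 1, 21, 12, 3, 35);

        // TIMESTAMP WITH LOCAL TIME ZONE: a point on the UTC time line; a zone
        // is only supplied when rendering it for a session.
        Instant instant = wallClock.atZone(ZoneId.of("Asia/Shanghai")).toInstant();
        LocalDateTime renderedInBerlin =
                LocalDateTime.ofInstant(instant, ZoneId.of("Europe/Berlin"));

        // TIMESTAMP WITH TIME ZONE: the instant plus the producer's zone, both kept.
        ZonedDateTime withZone = wallClock.atZone(ZoneId.of("Asia/Shanghai"));

        System.out.println(wallClock + " / " + renderedInBerlin + " / " + withZone);
    }
}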
On 22.01.21 09:38, Kurt Young wrote:

Forgot one more thing, continuing on displaying in UTC: as a user, if Flink wants to display the timestamp in UTC, why don't we offer something like UTC_TIMESTAMP?

Best,
Kurt

On Fri, Jan 22, 2021 at 4:33 PM Kurt Young <ykt...@gmail.com> wrote:

Before jumping into technical details, let's take a step back to discuss user experience.

The first important question is what kind of date and time Flink will display when users call CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are similar).

Should it always display the date and time in UTC or in the user's time zone? I think this part is the reason that surprised lots of users. If we forget about the type and internal representation of these two methods, as a user, my instinct tells me that these two methods should display my wall clock time.

Display the time in UTC? I'm not sure; why should I care about UTC time? I want to get my current timestamp. Users who have never gone abroad might not even realize that this is affected by the time zone.

Best,
Kurt

On Fri, Jan 22, 2021 at 12:25 PM Leonard Xu <xbjt...@gmail.com> wrote:

Thanks @Timo for the detailed reply; let's continue this topic here, I've merged all mails into this discussion.

> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
>
> --> uses session time zone, returns DATE/TIME/TIMESTAMP
>
> CURRENT_DATE / CURRENT_TIME / CURRENT_TIMESTAMP
>
> --> uses session time zone, returns DATE/TIME/TIMESTAMP
>
> I'm very sceptical about this behavior. Almost all mature systems (Oracle, Postgres) and new high quality systems (Presto, Snowflake) use a data type with some degree of time zone information encoded. In a globalized world with businesses spanning different regions, I think we should do this as well. There should be a difference between CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to choose which behavior they prefer for their pipeline.

I know that the two series should be different at first glance, but different SQL engines can have their own interpretations. For example, CURRENT_TIMESTAMP and LOCALTIMESTAMP are synonyms in Snowflake [1] with no difference, and Spark only supports the CURRENT_* functions and doesn't support LOCALTIME/LOCALTIMESTAMP [2].

> If we would design this from scratch, I would suggest the following:
>
> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE / LOCALTIME for materialized timestamp parts

The functions CURRENT_DATE/CURRENT_TIME are supported in the SQL standard, but LOCALDATE is not. I don't think it's a good idea to drop functions which the SQL standard supports and introduce a replacement which the SQL standard does not mention.

> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to materialize all session time information into every record. It is the most generic data type and allows casting to all other timestamp data types.
> This generic ability can be used for filter predicates as well, either through implicit or explicit casting.

TIMESTAMP WITH TIME ZONE indeed contains more information to describe a time point, but the TIMESTAMP type can also be cast to all other timestamp data types when combined with the session time zone, and it can be used for filter predicates as well. For type casting between BIGINT and TIMESTAMP, I think the function-based way using TO_TIMESTAMP()/FROM_UNIXTIMESTAMP() is clearer.

> PROCTIME/ROWTIME should be time functions based on a long value. Both System.currentTimeMillis() and our watermark system work on long values. Those should return TIMESTAMP WITH LOCAL TIME ZONE because the main calculation should always happen based on UTC.
>
> We discussed it in a different thread, but we should allow PROCTIME globally. People need a way to create instances of TIMESTAMP WITH LOCAL TIME ZONE. This is not considered in the current design doc.
>
> Many pipelines contain UTC timestamps and thus it should be easy to create one.
>
> Also, both CURRENT_TIMESTAMP and LOCALTIMESTAMP can work with this type because we should remember that TIMESTAMP WITH LOCAL TIME ZONE accepts all timestamp data types as casting target [1]. We could allow TIMESTAMP WITH TIME ZONE in the future for ROWTIME.
>
> In any case, windows should simply adapt their behavior to the passed timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is defined by considering the current session time zone.

I also agree that returning TIMESTAMP WITH LOCAL TIME ZONE for PROCTIME has clearer semantics, but I realized that users don't care about the type so much as about the expressed value they see, and changing the type from TIMESTAMP to TIMESTAMP WITH LOCAL TIME ZONE brings a huge refactoring: we need to consider all places where the TIMESTAMP type is used, and many built-in functions and UDFs do not support the TIMESTAMP WITH LOCAL TIME ZONE type. That means both users and Flink devs would need to refactor code (UDFs, built-in functions, SQL pipelines); to be honest, I don't see strong motivation for such a big refactoring from either the user's or the developer's perspective.

In one word, both your suggestion and my proposal can resolve almost all user problems; the divergence is whether we need to spend considerable energy just to get slightly more accurate semantics. I think we need a tradeoff.

Best,
Leonard

[1] https://trino.io/docs/current/functions/datetime.html#current_timestamp
[2] https://issues.apache.org/jira/browse/SPARK-30374

2021-01-22, 00:53, Timo Walther <twal...@apache.org>:

Hi Leonard,

thanks for working on this topic.
I agree that time handling is not easy in Flink at the moment. We added new time data types (and some are still not supported, which complicates things even further, like TIME(9)). We should definitely improve this situation for users.

This is a pretty opinionated topic and it seems that the SQL standard is not really deciding this, but it is at least supportive. So let me express my opinion for the most important functions:

LOCALDATE / LOCALTIME / LOCALTIMESTAMP

--> uses session time zone, returns DATE/TIME/TIMESTAMP

I think those are the most obvious ones, because the LOCAL indicates that the locality should be materialized into the result, and any time zone information (coming from session config or data) is not important afterwards.

CURRENT_DATE / CURRENT_TIME / CURRENT_TIMESTAMP

--> uses session time zone, returns DATE/TIME/TIMESTAMP

I'm very sceptical about this behavior. Almost all mature systems (Oracle, Postgres) and new high quality systems (Presto, Snowflake) use a data type with some degree of time zone information encoded. In a globalized world with businesses spanning different regions, I think we should do this as well. There should be a difference between CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to choose which behavior they prefer for their pipeline.

If we would design this from scratch, I would suggest the following:

- drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE / LOCALTIME for materialized timestamp parts

- CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to materialize all session time information into every record. It is the most generic data type and allows casting to all other timestamp data types. This generic ability can be used for filter predicates as well, either through implicit or explicit casting.

PROCTIME/ROWTIME should be time functions based on a long value. Both System.currentTimeMillis() and our watermark system work on long values. Those should return TIMESTAMP WITH LOCAL TIME ZONE because the main calculation should always happen based on UTC. We discussed it in a different thread, but we should allow PROCTIME globally. People need a way to create instances of TIMESTAMP WITH LOCAL TIME ZONE. This is not considered in the current design doc. Many pipelines contain UTC timestamps and thus it should be easy to create one. Also, both CURRENT_TIMESTAMP and LOCALTIMESTAMP can work with this type because we should remember that TIMESTAMP WITH LOCAL TIME ZONE accepts all timestamp data types as casting target [1]. We could allow TIMESTAMP WITH TIME ZONE in the future for ROWTIME.

In any case, windows should simply adapt their behavior to the passed timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is defined by considering the current session time zone.
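A sketch of what "a day defined by the current session time zone" would mean in practice, assuming a table like the earlier events sketch with a TIMESTAMP WITH LOCAL TIME ZONE rowtime attribute ts; setting the session zone via TableConfig is existing API, while the local-midnight window boundaries are the behaviour being proposed here, not the current one:

import java.time.ZoneId;
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class DailyWindowSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Session time zone: with Asia/Shanghai, a 1-day window over a
        // TIMESTAMP WITH LOCAL TIME ZONE rowtime would be expected to start at
        // local midnight (16:00 UTC of the previous day), not at UTC midnight.
        tEnv.getConfig().setLocalTimeZone(ZoneId.of("Asia/Shanghai"));

        tEnv.executeSql(
                "SELECT TUMBLE_START(ts, INTERVAL '1' DAY) AS window_start,"
                        + "       COUNT(*) AS pv"
                        + "  FROM events"
                        + "  GROUP BY TUMBLE(ts, INTERVAL '1' DAY)");
    }
}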
If we would like to design this with less effort required, we could think about returning TIMESTAMP WITH LOCAL TIME ZONE also for CURRENT_TIMESTAMP.

I will try to involve more people in this discussion.

Thanks,
Timo

[1] https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3

2021-01-21, 22:32, Leonard Xu <xbjt...@gmail.com>:

> Before the changes, as I am writing this reply, the local time here is 2021-01-21 12:03:35 (Beijing time, UTC+8).
> And I tried these 5 functions in the SQL client, and got:
>
> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, CURRENT_DATE, CURRENT_TIME;
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> |                  EXPR$0 |                  EXPR$1 |       CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>
> After the changes, the expected behavior will change to:
>
> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, CURRENT_DATE, CURRENT_TIME;
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> |                  EXPR$0 |                  EXPR$1 |       CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>
> The return type of now(), proctime() and CURRENT_TIMESTAMP will still be TIMESTAMP.

To Kurt: thanks for the intuitive case, it is really clear. You're right that I want to propose changing the return value of these functions. It's the most important part of this topic from the user's perspective.

> I think this definitely deserves a FLIP.

To Jark: nice suggestion, I have prepared a FLIP for this topic and will start the FLIP discussion soon.

> If we use the default Flink SQL, the window time range of the statistics is incorrect, and then the statistical results will naturally be incorrect.

To zhisheng: sorry to hear that this problem influenced your production jobs. Could you share your SQL pattern?
That way we can have more input and try to resolve it.

Best,
Leonard

2021-01-21, 14:19, Jark Wu <imj...@gmail.com>:

Great examples to understand the problem and the proposed changes, @Kurt!

Thanks Leonard for investigating this problem. The time-zone problems around time functions and windows have bothered a lot of users. It's time to fix them!

The return value changes sound reasonable to me, and keeping the return type unchanged will minimize the surprise to users. Besides that, I think it would be better to mention how this affects the window behaviors and the interoperability with DataStream.

I think this definitely deserves a FLIP.

====================================================

Hi zhisheng,

Do you have examples to illustrate which cases will get the wrong window boundaries? That will help to verify whether the proposed changes can solve your problem.

Best,
Jark

2021-01-21, 12:54, zhisheng <173855...@qq.com>:

Thanks to Leonard Xu for discussing this tricky topic. At present, there are many Flink jobs in our production environment that are used to compute day-level reports (e.g. counting PV/UV).

If we use the default Flink SQL, the window time range of the statistics is incorrect, and then the statistical results will naturally be incorrect.

The user needs to deal with the time zone manually in order to solve the problem. If Flink itself can solve these time zone issues, I think that will be user-friendly.

Thank you

Best!
zhisheng

2021-01-21, 12:11, Kurt Young <ykt...@gmail.com>:

cc'ing this to the user & user-zh mailing lists because this will affect lots of users, and quite a lot of users have been asking questions around this topic.

Let me try to understand this from the user's perspective.

Your proposal will affect five functions, which are:
PROCTIME()
NOW()
CURRENT_DATE
CURRENT_TIME
CURRENT_TIMESTAMP

Before the changes, as I am writing this reply, the local time here is 2021-01-21 12:03:35 (Beijing time, UTC+8).
And I tried these 5 functions in the SQL client, and got:

Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, CURRENT_DATE, CURRENT_TIME;
+-------------------------+-------------------------+-------------------------+--------------+--------------+
|                  EXPR$0 |                  EXPR$1 |       CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
+-------------------------+-------------------------+-------------------------+--------------+--------------+
| 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
+-------------------------+-------------------------+-------------------------+--------------+--------------+

After the changes, the expected behavior will change to:

Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP, CURRENT_DATE, CURRENT_TIME;
+-------------------------+-------------------------+-------------------------+--------------+--------------+
|                  EXPR$0 |                  EXPR$1 |       CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
+-------------------------+-------------------------+-------------------------+--------------+--------------+
| 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
+-------------------------+-------------------------+-------------------------+--------------+--------------+

The return type of now(), proctime() and CURRENT_TIMESTAMP will still be TIMESTAMP.

Best,
Kurt