+1 to have shortcut types TIMESTAMP_LTZ, TIMESTAMP_TZ.

Best,
Jark


On Thu, 28 Jan 2021 at 17:32, Timo Walther <twal...@apache.org> wrote:

> Hi Leonard,
>
> thanks for the great summary and the updated FLIP. I think using
> TIMESTAMP_LTZ for CURRENT_TIMESTAMP/PROCTIME/ROWTIME is a good long-term
> solution. I also discussed this with people of different backgrounds
> internally and everybody seems to agree to the proposed design. I hope
> we can have a stable implementation in 1.13 because a lot of locations
> will be touched for this change: time attributes, watermark generators,
> connectors, formats, converters, functions, windows.
>
> The FLIP is in a very good shape. I think we can start a voting soon if
> there are no objections. I have some last comments:
>
> 1) Are we on the same page that LOCALTIMESTAMP returns TIMESTAMP and not
> TIMESTAMP_LTZ? Maybe we should quickly list also LOCALTIME/LOCALDATE and
> LOCALTIMESTAMP for completeness.
>
> 2) Shall we add aliases for the timestamp types as part of this FLIP? I
> see Snowflake supports TIMESTAMP_LTZ , TIMESTAMP_NTZ , TIMESTAMP_TZ [1].
> I think the discussion was quite cumbersome with the full string of
> `TIMESTAMP WITH LOCAL TIME ZONE`. With this FLIP we are making this type
> even more prominent. And important concepts should have a short name
> because they are used frequently. According to the FLIP, we are
> introducing the abbriviation already in function names like
> `TO_TIMESTAMP_LTZ`. `TIMESTAMP_LTZ` could be treated similar to `STRING`
> for `VARCHAR(MAX_INT)`, the serializable string representation would not
> change.
>
> 3) I'm fine with supporting all conversion classes like
> java.time.LocalDateTime, java.sql.Timestamp that TimestampType supported
>   for LocalZonedTimestampType. But we agree that Instant stays the
> default conversion class right? The default extraction defined in [2]
> will not change, correct?
>
> 4) I would remove the comment "Flink supports TIME-related types with
> precision well", because unfortunately this is still not correct. We
> still have issues with TIME(9), it would be great if someone can finally
> fix that though. Maybe the implementation of this FLIP would be a good
> time to fix this issue.
>
> Regards,
> Timo
>
>
> [1]
>
> https://docs.snowflake.com/en/sql-reference/data-types-datetime.html#timestamp-ltz-timestamp-ntz-timestamp-tz
>
> [2]
>
> https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/utils/ClassDataTypeConverter.java
>
> On 28.01.21 03:18, Jark Wu wrote:
> > Thanks Leonard for the further investigation.
> >
> > I think we all agree we should correct the return value of
> > CURRENT_TIMESTAMP.
> > Regarding the return type of CURRENT_TIMESTAMP, I also agree
> TIMESTAMP_LTZ
> > would be more worldwide useful. This may need more effort, but if this is
> > the right direction, we should do it.
> >
> > Regarding the CURRENT_TIME, if CURRENT_TIMESTAMP returns
> >   TIMESTAMP_LTZ, then I think CURRENT_TIME shouldn't return TIME_TZ.
> > Otherwise, CURRENT_TIME will be quite special and strange.
> > Thus I think it has to return TIME type. Given that we already have
> > CURRENT_DATE which returns
> >   DATE WITHOUT TIME ZONE, I think it's fine to return TIME WITHOUT TIME
> ZONE
> > for CURRENT_TIME.
> >
> > In a word, the updated FLIP looks good to me. I especially like the
> > proposed new function TO_TIMESTAMP_LTZ(numeric, [,scale]).
> > This will be very convenient to define rowtime on a long value which is a
> > very common case and has been complained a lot in mailing list.
> >
> >
> > Best,
> > Jark
> >
> >
> >
> >
> >
> > On Mon, 25 Jan 2021 at 21:12, Kurt Young <ykt...@gmail.com> wrote:
> >
> >> Thanks Leonard for the detailed response and also the bad case about
> option
> >> 1, these all
> >> make sense to me.
> >>
> >> Also nice catch about conversion support of LocalZonedTimestampType, I
> >> think it actually
> >> makes sense to support java.sql.Timestamp as well as
> >> java.time.LocalDateTime. It also has
> >> a slight benefit that we might have a chance to run the udf which took
> them
> >> as input parameter
> >> after we change the return type.
> >>
> >> Regarding to the return type of CURRENT_TIME, I also think timezone
> >> information is not useful.
> >> To not expand this FLIP further, I'm lean to keep it as it is.
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Mon, Jan 25, 2021 at 8:50 PM Leonard Xu <xbjt...@gmail.com> wrote:
> >>
> >>> Hi, All
> >>>
> >>>   Thanks for your comments. I think all of the thread have agreed that:
> >>> (1) The return values of
> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME()
> >>> are wrong.
> >>> (2) The LOCALTIME/LOCALTIMESTAMP and CURRENT_TIME/CURRENT_TIMESTAMP
> >> should
> >>> be different whether from SQL standard’s perspective or mature systems.
> >>> (3) The semantics of three TIMESTAMP types in Flink SQL follows the SQL
> >>> standard and also keeps the same with other 'good' vendors.
> >>>      TIMESTAMP                                   =>  A literal in
> >>> ‘yyyy-MM-dd HH:mm:ss’ format to describe a time, does not contain
> >> timezone
> >>> info, can not represent an absolute time point.
> >>>      TIMESTAMP WITH LOCAL ZONE =>  Records the elapsed time from
> absolute
> >>> time point origin, can represent an absolute time point, requires local
> >>> time zone when expressed with ‘yyyy-MM-dd HH:mm:ss’ format.
> >>>      TIMESTAMP WITH TIME ZONE    =>  Consists of time zone info and a
> >>> literal in ‘yyyy-MM-dd HH:mm:ss’ format to describe time, can represent
> >> an
> >>> absolute time point.
> >>>
> >>>
> >>> Currently we've two ways to correct
> >>> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME().
> >>>
> >>> option (1): As the FLIP proposed, change the return value  from UTC
> >>> timezone to local timezone.
> >>>          Pros:   (1) The change looks smaller to users and developers
> (2)
> >>> There're many SQL engines adopted this way
> >>>          Cons:  (1) connector devs may confuse the underlying value of
> >>> TimestampData which needs to change according to data type  (2) I
> thought
> >>> about this weekend. Unfortunately I found a bad case:
> >>>
> >>> The proposal is fine if we only use it in FLINK SQL world, but we need
> to
> >>> consider the conversion between Table/DataStream, assume a record
> >> produced
> >>> in UTC+0 timezone with TIMESTAMP '1970-01-01 08:00:44'  and the Flink
> SQL
> >>> processes the data with session time zone 'UTC+8', if the sql program
> >> need
> >>> to convert the Table to DataStream, then we need to calculate the
> >> timestamp
> >>> in StreamRecord with session time zone (UTC+8), then we will get 44 in
> >>> DataStream program, but it is wrong because the expected value should
> be
> >> (8
> >>> * 60 * 60 + 44). The corner case tell us that the ROWTIME/PROCTIME in
> >> Flink
> >>> are based on UTC+0, when correct the PROCTIME() function, the better
> way
> >> is
> >>> to use TIMESTAMP WITH LOCAL TIME ZONE which keeps same long value with
> >> time
> >>> based on UTC+0 and can be expressed with  local timezone.
> >>>
> >>> option (2) : As we considered in the FLIP as well as @Timo suggested,
> >>> change the return type to TIMESTAMP WITH LOCAL TIME ZONE, the expressed
> >>> value depends on the local time zone.
> >>>          Pros: (1) Make Flink SQL more close to SQL standard  (2) Can
> deal
> >>> the conversion between Table/DataStream well
> >>>          Cons: (1) We need to discuss the return value/type of
> >> CURRENT_TIME
> >>> function (2) The change is bigger to users, we need to support
> TIMESTAMP
> >>> WITH LOCAL TIME ZONE in connectors/formats as well as custom
> connectors.
> >>>                     (3)The TIMESTAMP WITH LOCAL TIME ZONE support is
> weak
> >>> in Flink, thus we need some improvement,but the workload does not
> matter
> >>> as long as we are doing the right thing ^_^
> >>>
> >>> Due to the above bad case for option (1). I think option 2 should be
> >>> adopted,
> >>> But we also need to consider some problems:
> >>> (1) More conversion classes like LocalDateTime, sql.Timestamp should be
> >>> supported for LocalZonedTimestampType to resolve the UDF compatibility
> >> issue
> >>> (2) The timezone offset for window size of one day should still be
> >>> considered
> >>> (3) All connectors/formats should supports TIMESTAMP WITH LOCAL TIME
> ZONE
> >>> well and we also should record in document
> >>> I’ll update these sections of FLIP-162.
> >>>
> >>>
> >>>
> >>> We also need to discuss the CURRENT_TIME function. I know the standard
> >> way
> >>> is using TIME WITH TIME ZONE(there's no TIME WITH LOCAL TIME ZONE), but
> >> we
> >>> don't support this type yet and I don't see strong motivation to
> support
> >> it
> >>> so far.
> >>> Compared to CURRENT_TIMESTAMP, the CURRENT_TIME can not represent an
> >>> absolute time point which should be considered as a string consisting
> of
> >> a
> >>> time with 'HH:mm:ss' format and time zone info.  We have several
> options
> >>> for this:
> >>> (1) We can forbid CURRENT_TIME as @Timo proposed to make all Flink SQL
> >>> functions follow the standard well,  in this way, we need to offer some
> >>> guidance for user upgrading Flink versions.
> >>> (2) We can also support it from a user's perspective who has used
> >>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP, btw,Snowflake also returns
> >>> TIME type.
> >>> (3) Returns TIMESTAMP WITH LOCAL TIME ZONE to make it equal to
> >>> CURRENT_TIMESTAMP as Calcite did.
> >>>
> >>> I can image (1) which we don't want to left a bad smell in Flink SQL,
> >> and
> >>> I also accept (2) because I think users do not consider time zone
> issues
> >>> when they use CURRENT_DATE/CURRENT_TIME, and the timezone info in time
> is
> >>> not very useful.
> >>>
> >>> I don’t have a strong opinion  for them.  What do others think?
> >>>
> >>>
> >>> I hope I've addressed your concerns. @Timo @Kurt
> >>>
> >>> Best,
> >>> Leonard
> >>>
> >>>
> >>>
> >>>> Most of the mature systems have a clear difference between
> >>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. I wouldn't take Spark or Hive as
> a
> >>> good example. Snowflake decided for TIMESTAMP WITH LOCAL TIME ZONE. As
> I
> >>> mentioned in the last comment, I could also imagine this behavior for
> >>> Flink. But in any case, there should be some time zone information
> >>> considered in order to cast to all other types.
> >>>>
> >>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL
> >>> standard, but
> >>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping
> >>> functions which
> >>>>>>> SQL standard supported and introducing a replacement which SQL
> >>> standard not
> >>>>>>> reminded.
> >>>>
> >>>> We can still add those functions in the future. But since we don't
> >> offer
> >>> a TIME WITH TIME ZONE, it is better to not support this function at all
> >> for
> >>> now. And by the way, this is exactly the behavior that also Microsoft
> SQL
> >>> Server does: it also just supports CURRENT_TIMESTAMP (but it returns
> >>> TIMESTAMP without a zone which completes the confusion).
> >>>>
> >>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE for PROCTIME
> >>> has
> >>>>>>> more clear semantics, but I realized that user didn’t care the type
> >>> but
> >>>>>>> more about the expressed value they saw, and change the type from
> >>> TIMESTAMP
> >>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that we need
> >>>>>>> consider all places where the TIMESTAMP type used
> >>>>
> >>>>  From a UDF perspective, I think nothing will change. The new type
> >> system
> >>> and type inference were designed to support all these cases. There is a
> >>> reason why Java has adopted Joda time, because it is hard to come up
> >> with a
> >>> good time library. That's why also we and the other Hadoop ecosystem
> >> folks
> >>> have decided for 3 different kinds of LocalDateTime, ZonedDateTime, and
> >>> Instance. It makes the library more complex, but time is a complex
> topic.
> >>>>
> >>>> I also doubt that many users work with only one time zone. Take the US
> >>> as an example, a country with 3 different timezones. Somebody working
> >> with
> >>> US data cannot properly see the data points with just LOCAL TIME ZONE.
> >> But
> >>> on the other hand, a lot of event data is stored using a UTC timestamp.
> >>>>
> >>>>
> >>>>>> Before jumping into technique details, let's take a step back to
> >>> discuss
> >>>>>> user experience.
> >>>>>>
> >>>>>> The first important question is what kind of date and time will
> >> Flink
> >>>>>> display when users call
> >>>>>>    CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are
> >>> similar).
> >>>>>>
> >>>>>> Should it always display the date and time in UTC or in the user's
> >>> time
> >>>>>> zone?
> >>>>
> >>>> @Kurt: I think we all agree that the current behavior with just
> showing
> >>> UTC is wrong. Also, we all agree that when calling CURRENT_TIMESTAMP or
> >>> PROCTIME a user would like to see the time in it's current time zone.
> >>>>
> >>>> As you said, "my wall clock time".
> >>>>
> >>>> However, the question is what is the data type of what you "see". If
> >> you
> >>> pass this record on to a different system, operator, or different
> >> cluster,
> >>> should the "my" get lost or materialized into the record?
> >>>>
> >>>> TIMESTAMP -> completely lost and could cause confusion in a different
> >>> system
> >>>>
> >>>> TIMESTAMP WITH LOCAL TIME ZONE -> at least the UTC is correct, so you
> >>> can provide a new local time zone
> >>>>
> >>>> TIMESTAMP WITH TIME ZONE -> also "your" location is persisted
> >>>>
> >>>> Regards,
> >>>> Timo
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> On 22.01.21 09:38, Kurt Young wrote:
> >>>>> Forgot one more thing. Continue with displaying in UTC. As a user, if
> >>> Flink
> >>>>> want to display the timestamp
> >>>>> in UTC, why don't we offer something like UTC_TIMESTAMP?
> >>>>> Best,
> >>>>> Kurt
> >>>>> On Fri, Jan 22, 2021 at 4:33 PM Kurt Young <ykt...@gmail.com> wrote:
> >>>>>> Before jumping into technique details, let's take a step back to
> >>> discuss
> >>>>>> user experience.
> >>>>>>
> >>>>>> The first important question is what kind of date and time will
> Flink
> >>>>>> display when users call
> >>>>>>   CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are
> >>> similar).
> >>>>>>
> >>>>>> Should it always display the date and time in UTC or in the user's
> >> time
> >>>>>> zone? I think this part is the
> >>>>>> reason that surprised lots of users. If we forget about the type and
> >>>>>> internal representation of these
> >>>>>> two methods, as a user, my instinct tells me that these two methods
> >>> should
> >>>>>> display my wall clock time.
> >>>>>>
> >>>>>> Display time in UTC? I'm not sure, why I should care about UTC time?
> >> I
> >>>>>> want to get my current timestamp.
> >>>>>> For those users who have never gone abroad, they might not even be
> >>> able to
> >>>>>> realize that this is affected
> >>>>>> by the time zone.
> >>>>>>
> >>>>>> Best,
> >>>>>> Kurt
> >>>>>>
> >>>>>>
> >>>>>> On Fri, Jan 22, 2021 at 12:25 PM Leonard Xu <xbjt...@gmail.com>
> >> wrote:
> >>>>>>
> >>>>>>> Thanks @Timo for the detailed reply, let's go on this topic on this
> >>>>>>> discussion,  I've merged all mails to this discussion.
> >>>>>>>
> >>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
> >>>>>>>>
> >>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>
> >>>>>>>>
> >>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
> >>>>>>>>
> >>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>
> >>>>>>>> I'm very sceptical about this behavior. Almost all mature systems
> >>>>>>> (Oracle, Postgres) and new high quality systems (Presto, Snowflake)
> >>> use a
> >>>>>>> data type with some degree of time zone information encoded. In a
> >>>>>>> globalized world with businesses spanning different regions, I
> think
> >>> we
> >>>>>>> should do this as well. There should be a difference between
> >>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to
> >>> choose
> >>>>>>> which behavior they prefer for their pipeline.
> >>>>>>>
> >>>>>>>
> >>>>>>> I know that the two series should be different at first glance, but
> >>>>>>> different SQL engines can have their own explanations,for example,
> >>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP are synonyms in Snowflake[1]
> >> and
> >>> has
> >>>>>>> no difference, and Spark only supports the later one and doesn’t
> >>> support
> >>>>>>> LOCALTIME/LOCALTIMESTAMP[2].
> >>>>>>>
> >>>>>>>
> >>>>>>>> If we would design this from scatch, I would suggest the
> following:
> >>>>>>>>
> >>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE /
> >>>>>>> LOCALTIME for materialized timestamp parts
> >>>>>>>
> >>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL
> >> standard,
> >>> but
> >>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping
> >> functions
> >>> which
> >>>>>>> SQL standard supported and introducing a replacement which SQL
> >>> standard not
> >>>>>>> reminded.
> >>>>>>>
> >>>>>>>
> >>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to
> >>>>>>> materialize all session time information into every record. It it
> >> the
> >>> most
> >>>>>>> generic data type and allows to cast to all other timestamp data
> >>> types.
> >>>>>>> This generic ability can be used for filter predicates as well
> >> either
> >>>>>>> through implicit or explicit casting.
> >>>>>>>
> >>>>>>> TIMESTAMP WITH TIME ZONE indeed contains more information to
> >> describe
> >>> a
> >>>>>>> time point, but the type TIMESTAMP  can cast to all other timestamp
> >>> data
> >>>>>>> types combining with session time zone as well, and it also can be
> >>> used for
> >>>>>>> filter predicates. For type casting between BIGINT and TIMESTAMP, I
> >>> think
> >>>>>>> the function way using TO_TIMEMTAMP()/FROM_UNIXTIMESTAMP() is more
> >>> clear.
> >>>>>>>
> >>>>>>>> PROCTIME/ROWTIME should be time functions based on a long value.
> >> Both
> >>>>>>> System.currentMillis() and our watermark system work on long
> values.
> >>> Those
> >>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the main
> >>> calculation
> >>>>>>> should always happen based on UTC.
> >>>>>>>> We discussed it in a different thread, but we should allow
> PROCTIME
> >>>>>>> globally. People need a way to create instances of TIMESTAMP WITH
> >>> LOCAL
> >>>>>>> TIME ZONE. This is not considered in the current design doc.
> >>>>>>>> Many pipelines contain UTC timestamps and thus it should be easy
> to
> >>>>>>> create one.
> >>>>>>>> Also, both CURRENT_TIMESTAMP and LOCALTIMESTAMP can work with this
> >>> type
> >>>>>>> because we should remember that TIMESTAMP WITH LOCAL TIME ZONE
> >>> accepts all
> >>>>>>> timestamp data types as casting target [1]. We could allow
> TIMESTAMP
> >>> WITH
> >>>>>>> TIME ZONE in the future for ROWTIME.
> >>>>>>>> In any case, windows should simply adapt their behavior to the
> >> passed
> >>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is
> >>> defined by
> >>>>>>> considering the current session time zone.
> >>>>>>>
> >>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE for PROCTIME
> >>> has
> >>>>>>> more clear semantics, but I realized that user didn’t care the type
> >>> but
> >>>>>>> more about the expressed value they saw, and change the type from
> >>> TIMESTAMP
> >>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that we need
> >>>>>>> consider all places where the TIMESTAMP type used, and many builtin
> >>>>>>> functions and UDFs doest not support  TIMESTAMP WITH LOCAL TIME
> ZONE
> >>> type.
> >>>>>>> That means both user and Flink devs need to refactor the code(UDF,
> >>> builtin
> >>>>>>> functions, sql pipeline), to be honest, I didn’t see strong
> >>> motivation that
> >>>>>>> we have to do the pretty big refactor from user’s perspective and
> >>>>>>> developer’s perspective.
> >>>>>>>
> >>>>>>> In one word, both your suggestion and my proposal can resolve
> almost
> >>> all
> >>>>>>> user problems,the divergence is whether we need to spend pretty
> >>> energy just
> >>>>>>> to get a bit more accurate semantics?   I think we need a tradeoff.
> >>>>>>>
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Leonard
> >>>>>>> [1]
> >>>>>>>
> >>>
> https://trino.io/docs/current/functions/datetime.html#current_timestamp
> >> <
> >>>>>>>
> >>>
> https://trino.io/docs/current/functions/datetime.html#current_timestamp>
> >>>>>>> [2] https://issues.apache.org/jira/browse/SPARK-30374 <
> >>>>>>> https://issues.apache.org/jira/browse/SPARK-30374>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>   2021-01-22,00:53,Timo Walther <twal...@apache.org> :
> >>>>>>>>
> >>>>>>>> Hi Leonard,
> >>>>>>>>
> >>>>>>>> thanks for working on this topic. I agree that time handling is
> not
> >>>>>>> easy in Flink at the moment. We added new time data types (and some
> >>> are
> >>>>>>> still not supported which even further complicates things like
> >>> TIME(9)). We
> >>>>>>> should definitely improve this situation for users.
> >>>>>>>>
> >>>>>>>> This is a pretty opinionated topic and it seems that the SQL
> >> standard
> >>>>>>> is not really deciding this but is at least supporting. So let me
> >>> express
> >>>>>>> my opinion for the most important functions:
> >>>>>>>>
> >>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
> >>>>>>>>
> >>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>
> >>>>>>>> I think those are the most obvious ones because the LOCAL
> indicates
> >>>>>>> that the locality should be materialized into the result and any
> >> time
> >>> zone
> >>>>>>> information (coming from session config or data) is not important
> >>>>>>> afterwards.
> >>>>>>>>
> >>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
> >>>>>>>>
> >>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
> >>>>>>>>
> >>>>>>>> I'm very sceptical about this behavior. Almost all mature systems
> >>>>>>> (Oracle, Postgres) and new high quality systems (Presto, Snowflake)
> >>> use a
> >>>>>>> data type with some degree of time zone information encoded. In a
> >>>>>>> globalized world with businesses spanning different regions, I
> think
> >>> we
> >>>>>>> should do this as well. There should be a difference between
> >>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to
> >>> choose
> >>>>>>> which behavior they prefer for their pipeline.
> >>>>>>>>
> >>>>>>>> If we would design this from scatch, I would suggest the
> following:
> >>>>>>>>
> >>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE /
> >>>>>>> LOCALTIME for materialized timestamp parts
> >>>>>>>>
> >>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to
> >>>>>>> materialize all session time information into every record. It it
> >> the
> >>> most
> >>>>>>> generic data type and allows to cast to all other timestamp data
> >>> types.
> >>>>>>> This generic ability can be used for filter predicates as well
> >> either
> >>>>>>> through implicit or explicit casting.
> >>>>>>>>
> >>>>>>>> PROCTIME/ROWTIME should be time functions based on a long value.
> >> Both
> >>>>>>> System.currentMillis() and our watermark system work on long
> values.
> >>> Those
> >>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the main
> >>> calculation
> >>>>>>> should always happen based on UTC. We discussed it in a different
> >>> thread,
> >>>>>>> but we should allow PROCTIME globally. People need a way to create
> >>>>>>> instances of TIMESTAMP WITH LOCAL TIME ZONE. This is not considered
> >>> in the
> >>>>>>> current design doc. Many pipelines contain UTC timestamps and thus
> >> it
> >>>>>>> should be easy to create one. Also, both CURRENT_TIMESTAMP and
> >>>>>>> LOCALTIMESTAMP can work with this type because we should remember
> >> that
> >>>>>>> TIMESTAMP WITH LOCAL TIME ZONE accepts all timestamp data types as
> >>> casting
> >>>>>>> target [1]. We could allow TIMESTAMP WITH TIME ZONE in the future
> >> for
> >>>>>>> ROWTIME.
> >>>>>>>>
> >>>>>>>> In any case, windows should simply adapt their behavior to the
> >> passed
> >>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is
> >>> defined by
> >>>>>>> considering the current session time zone.
> >>>>>>>>
> >>>>>>>> If we would like to design this with less effort required, we
> could
> >>>>>>> think about returning TIMESTAMP WITH LOCAL TIME ZONE also for
> >>>>>>> CURRENT_TIMESTAMP.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I will try to involve more people into this discussion.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>> Timo
> >>>>>>>>
> >>>>>>>> [1]
> >>>>>>>
> >>>
> >>
> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
> >>>>>>> <
> >>>>>>>
> >>>
> >>
> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>   2021-01-21,22:32,Leonard Xu <xbjt...@gmail.com> :
> >>>>>>>>> Before the changes, as I am writing this reply, the local time
> >> here
> >>> is
> >>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> >>>>>>>>> And I tried these 5 functions in sql client, and got:
> >>>>>>>>>
> >>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>> CURRENT_DATE,
> >>>>>>> CURRENT_TIME;
> >>>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>   CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
> >>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
> >>>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> After the changes, the expected behavior will change to:
> >>>>>>>>>
> >>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >>> CURRENT_DATE,
> >>>>>>> CURRENT_TIME;
> >>>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>   CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
> >>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
> >>>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP still
> >> be
> >>>>>>> TIMESTAMP;
> >>>>>>>>
> >>>>>>>> To Kurt, thanks  for the intuitive case, it really clear, you’re
> >>> wright
> >>>>>>> that I want to propose to change the return value of these
> >> functions.
> >>> It’s
> >>>>>>> the most important part of the topic from user's perspective.
> >>>>>>>>
> >>>>>>>>> I think this definitely deserves a FLIP.
> >>>>>>>> To Jark,  nice suggestion, I prepared a FLIP for this topic, and
> >> will
> >>>>>>> start the FLIP discussion soon.
> >>>>>>>>
> >>>>>>>>>> If use the default Flink SQL,&nbsp; the window time range of the
> >>>>>>>>>> statistics is incorrect, then the statistical results will
> >>> naturally
> >>>>>>> be
> >>>>>>>>>> incorrect.
> >>>>>>>> To zhisheng, sorry to hear that this problem influenced your
> >>> production
> >>>>>>> jobs,  Could you share your SQL pattern?  we can have more inputs
> >> and
> >>> try
> >>>>>>> to resolve them.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Leonard
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>   2021-01-21,14:19,Jark Wu <imj...@gmail.com> :
> >>>>>>>>
> >>>>>>>> Great examples to understand the problem and the proposed changes,
> >>>>>>> @Kurt!
> >>>>>>>>
> >>>>>>>> Thanks Leonard for investigating this problem.
> >>>>>>>> The time-zone problems around time functions and windows have
> >>> bothered a
> >>>>>>>> lot of users. It's time to fix them!
> >>>>>>>>
> >>>>>>>> The return value changes sound reasonable to me, and keeping the
> >>> return
> >>>>>>>> type unchanged will minimize the surprise to the users.
> >>>>>>>> Besides that, I think it would be better to mention how this
> >> affects
> >>> the
> >>>>>>>> window behaviors, and the interoperability with DataStream.
> >>>>>>>>
> >>>>>>>> I think this definitely deserves a FLIP.
> >>>>>>>>
> >>>>>>>> ====================================================
> >>>>>>>>
> >>>>>>>> Hi zhisheng,
> >>>>>>>>
> >>>>>>>> Do you have examples to illustrate which case will get the wrong
> >>> window
> >>>>>>>> boundaries?
> >>>>>>>> That will help to verify whether the proposed changes can solve
> >> your
> >>>>>>>> problem.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jark
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>> 2021-01-21,12:54,zhisheng <173855...@qq.com> :
> >>>>>>>>
> >>>>>>>> Thanks to Leonard Xu for discussing this tricky topic. At present,
> >>>>>>> there are many Flink jobs in our production environment that are
> >> used
> >>> to
> >>>>>>> count day-level reports (eg: count PV/UV ).&nbsp;
> >>>>>>>>
> >>>>>>>> If use the default Flink SQL,&nbsp; the window time range of the
> >>>>>>> statistics is incorrect, then the statistical results will
> naturally
> >>> be
> >>>>>>> incorrect.&nbsp;
> >>>>>>>>
> >>>>>>>> The user needs to deal with the time zone manually in order to
> >> solve
> >>>>>>> the problem.&nbsp;
> >>>>>>>>
> >>>>>>>> If Flink itself can solve these time zone issues, then I think it
> >>> will
> >>>>>>> be user-friendly.
> >>>>>>>>
> >>>>>>>> Thank you
> >>>>>>>>
> >>>>>>>> Best!;
> >>>>>>>> zhisheng
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>>   2021-01-21,12:11,Kurt Young <ykt...@gmail.com> :
> >>>>>>>>
> >>>>>>>> cc this to user & user-zh mailing list because this will affect
> >> lots
> >>> of
> >>>>>>> users, and also quite a lot of users
> >>>>>>>> were asking questions around this topic.
> >>>>>>>>
> >>>>>>>> Let me try to understand this from user's perspective.
> >>>>>>>>
> >>>>>>>> Your proposal will affect five functions, which are:
> >>>>>>>> PROCTIME()
> >>>>>>>> NOW()
> >>>>>>>> CURRENT_DATE
> >>>>>>>> CURRENT_TIME
> >>>>>>>> CURRENT_TIMESTAMP
> >>>>>>>> Before the changes, as I am writing this reply, the local time
> here
> >>> is
> >>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> >>>>>>>> And I tried these 5 functions in sql client, and got:
> >>>>>>>>
> >>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >> CURRENT_DATE,
> >>>>>>> CURRENT_TIME;
> >>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>   CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
> >>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
> >>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>> After the changes, the expected behavior will change to:
> >>>>>>>>
> >>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
> >> CURRENT_DATE,
> >>>>>>> CURRENT_TIME;
> >>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> >>>>>>>   CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> >>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
> >>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
> >>>>>>>>
> >>>>>>>
> >>>
> >>
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> >>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP still
> be
> >>>>>>> TIMESTAMP;
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Kurt
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>

Reply via email to