Thanks all for sharing your opinions.  

Looks like  we’ve reached a consensus about the topic.

@Timo:
> 1) Are we on the same page that LOCALTIMESTAMP returns TIMESTAMP and not 
> TIMESTAMP_LTZ? Maybe we should quickly list also LOCALTIME/LOCALDATE and 
> LOCALTIMESTAMP for completeness.
Yes, LOCALTIMESTAMP returns TIMESTAMP, LOCALTIME returns TIME, the behavior of 
them is clear so I just listed them in the excel[1] of this FLIP references. 

> 2) Shall we add aliases for the timestamp types as part of this FLIP? I see 
> Snowflake supports TIMESTAMP_LTZ , TIMESTAMP_NTZ , TIMESTAMP_TZ [1]. I think 
> the discussion was quite cumbersome with the full string of `TIMESTAMP WITH 
> LOCAL TIME ZONE`. With this FLIP we are making this type even more prominent. 
> And important concepts should have a short name because they are used 
> frequently. According to the FLIP, we are introducing the abbriviation 
> already in function names like `TO_TIMESTAMP_LTZ`. `TIMESTAMP_LTZ` could be 
> treated similar to `STRING` for `VARCHAR(MAX_INT)`, the serializable string 
> representation would not change.

@Timo @Jark
Nice idea, I also suffered from the long name during the discussions, the 
abbreviation will not only help us, but also makes it more convenient for 
users. I list the abbreviation name mapping to support:
TIMESTAMP WITHOUT TIME ZONE         <=> TIMESTAMP_NTZ   (which synonyms 
TIMESTAMP)
TIMESTAMP WITH LOCAL TIME ZONE    <=> TIMESTAMP_LTZ  
TIMESTAMP WITH TIME ZONE                 <=> TIMESTAMP_TZ     (supports them in 
the future)
> 3) I'm fine with supporting all conversion classes like 
> java.time.LocalDateTime, java.sql.Timestamp that TimestampType supported  for 
> LocalZonedTimestampType. But we agree that Instant stays the default 
> conversion class right? The default extraction defined in [2] will not 
> change, correct?
Yes, Instant stays the default conversion class. The default

> 4) I would remove the comment "Flink supports TIME-related types with 
> precision well", because unfortunately this is still not correct. We still 
> have issues with TIME(9), it would be great if someone can finally fix that 
> though. Maybe the implementation of this FLIP would be a good time to fix 
> this issue.
You’re right, TIME(9) is not supported yet, I'll take account of TIME(9) to the 
scope of this FLIP.


I’ve updated this FLIP[2] according your suggestions @Jark @Timo
I’ll start the vote soon if there’re no objections.

Best,
Leonard

[1] 
https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
 
<https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing>
[2] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior
 
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-162:+Consistent+Flink+SQL+time+function+behavior>
 

> 
> On 28.01.21 03:18, Jark Wu wrote:
>> Thanks Leonard for the further investigation.
>> I think we all agree we should correct the return value of
>> CURRENT_TIMESTAMP.
>> Regarding the return type of CURRENT_TIMESTAMP, I also agree TIMESTAMP_LTZ
>> would be more worldwide useful. This may need more effort, but if this is
>> the right direction, we should do it.
>> Regarding the CURRENT_TIME, if CURRENT_TIMESTAMP returns
>>  TIMESTAMP_LTZ, then I think CURRENT_TIME shouldn't return TIME_TZ.
>> Otherwise, CURRENT_TIME will be quite special and strange.
>> Thus I think it has to return TIME type. Given that we already have
>> CURRENT_DATE which returns
>>  DATE WITHOUT TIME ZONE, I think it's fine to return TIME WITHOUT TIME ZONE
>> for CURRENT_TIME.
>> In a word, the updated FLIP looks good to me. I especially like the
>> proposed new function TO_TIMESTAMP_LTZ(numeric, [,scale]).
>> This will be very convenient to define rowtime on a long value which is a
>> very common case and has been complained a lot in mailing list.
>> Best,
>> Jark
>> On Mon, 25 Jan 2021 at 21:12, Kurt Young <ykt...@gmail.com> wrote:
>>> Thanks Leonard for the detailed response and also the bad case about option
>>> 1, these all
>>> make sense to me.
>>> 
>>> Also nice catch about conversion support of LocalZonedTimestampType, I
>>> think it actually
>>> makes sense to support java.sql.Timestamp as well as
>>> java.time.LocalDateTime. It also has
>>> a slight benefit that we might have a chance to run the udf which took them
>>> as input parameter
>>> after we change the return type.
>>> 
>>> Regarding to the return type of CURRENT_TIME, I also think timezone
>>> information is not useful.
>>> To not expand this FLIP further, I'm lean to keep it as it is.
>>> 
>>> Best,
>>> Kurt
>>> 
>>> 
>>> On Mon, Jan 25, 2021 at 8:50 PM Leonard Xu <xbjt...@gmail.com> wrote:
>>> 
>>>> Hi, All
>>>> 
>>>>  Thanks for your comments. I think all of the thread have agreed that:
>>>> (1) The return values of CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME()
>>>> are wrong.
>>>> (2) The LOCALTIME/LOCALTIMESTAMP and CURRENT_TIME/CURRENT_TIMESTAMP
>>> should
>>>> be different whether from SQL standard’s perspective or mature systems.
>>>> (3) The semantics of three TIMESTAMP types in Flink SQL follows the SQL
>>>> standard and also keeps the same with other 'good' vendors.
>>>>     TIMESTAMP                                   =>  A literal in
>>>> ‘yyyy-MM-dd HH:mm:ss’ format to describe a time, does not contain
>>> timezone
>>>> info, can not represent an absolute time point.
>>>>     TIMESTAMP WITH LOCAL ZONE =>  Records the elapsed time from absolute
>>>> time point origin, can represent an absolute time point, requires local
>>>> time zone when expressed with ‘yyyy-MM-dd HH:mm:ss’ format.
>>>>     TIMESTAMP WITH TIME ZONE    =>  Consists of time zone info and a
>>>> literal in ‘yyyy-MM-dd HH:mm:ss’ format to describe time, can represent
>>> an
>>>> absolute time point.
>>>> 
>>>> 
>>>> Currently we've two ways to correct
>>>> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME().
>>>> 
>>>> option (1): As the FLIP proposed, change the return value  from UTC
>>>> timezone to local timezone.
>>>>         Pros:   (1) The change looks smaller to users and developers  (2)
>>>> There're many SQL engines adopted this way
>>>>         Cons:  (1) connector devs may confuse the underlying value of
>>>> TimestampData which needs to change according to data type  (2) I thought
>>>> about this weekend. Unfortunately I found a bad case:
>>>> 
>>>> The proposal is fine if we only use it in FLINK SQL world, but we need to
>>>> consider the conversion between Table/DataStream, assume a record
>>> produced
>>>> in UTC+0 timezone with TIMESTAMP '1970-01-01 08:00:44'  and the Flink SQL
>>>> processes the data with session time zone 'UTC+8', if the sql program
>>> need
>>>> to convert the Table to DataStream, then we need to calculate the
>>> timestamp
>>>> in StreamRecord with session time zone (UTC+8), then we will get 44 in
>>>> DataStream program, but it is wrong because the expected value should be
>>> (8
>>>> * 60 * 60 + 44). The corner case tell us that the ROWTIME/PROCTIME in
>>> Flink
>>>> are based on UTC+0, when correct the PROCTIME() function, the better way
>>> is
>>>> to use TIMESTAMP WITH LOCAL TIME ZONE which keeps same long value with
>>> time
>>>> based on UTC+0 and can be expressed with  local timezone.
>>>> 
>>>> option (2) : As we considered in the FLIP as well as @Timo suggested,
>>>> change the return type to TIMESTAMP WITH LOCAL TIME ZONE, the expressed
>>>> value depends on the local time zone.
>>>>         Pros: (1) Make Flink SQL more close to SQL standard  (2) Can deal
>>>> the conversion between Table/DataStream well
>>>>         Cons: (1) We need to discuss the return value/type of
>>> CURRENT_TIME
>>>> function (2) The change is bigger to users, we need to support TIMESTAMP
>>>> WITH LOCAL TIME ZONE in connectors/formats as well as custom connectors.
>>>>                    (3)The TIMESTAMP WITH LOCAL TIME ZONE support is weak
>>>> in Flink, thus we need some improvement,but the workload does not matter
>>>> as long as we are doing the right thing ^_^
>>>> 
>>>> Due to the above bad case for option (1). I think option 2 should be
>>>> adopted,
>>>> But we also need to consider some problems:
>>>> (1) More conversion classes like LocalDateTime, sql.Timestamp should be
>>>> supported for LocalZonedTimestampType to resolve the UDF compatibility
>>> issue
>>>> (2) The timezone offset for window size of one day should still be
>>>> considered
>>>> (3) All connectors/formats should supports TIMESTAMP WITH LOCAL TIME ZONE
>>>> well and we also should record in document
>>>> I’ll update these sections of FLIP-162.
>>>> 
>>>> 
>>>> 
>>>> We also need to discuss the CURRENT_TIME function. I know the standard
>>> way
>>>> is using TIME WITH TIME ZONE(there's no TIME WITH LOCAL TIME ZONE), but
>>> we
>>>> don't support this type yet and I don't see strong motivation to support
>>> it
>>>> so far.
>>>> Compared to CURRENT_TIMESTAMP, the CURRENT_TIME can not represent an
>>>> absolute time point which should be considered as a string consisting of
>>> a
>>>> time with 'HH:mm:ss' format and time zone info.  We have several  options
>>>> for this:
>>>> (1) We can forbid CURRENT_TIME as @Timo proposed to make all Flink SQL
>>>> functions follow the standard well,  in this way, we need to offer some
>>>> guidance for user upgrading Flink versions.
>>>> (2) We can also support it from a user's perspective who has used
>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP, btw,Snowflake also returns
>>>> TIME type.
>>>> (3) Returns TIMESTAMP WITH LOCAL TIME ZONE to make it equal to
>>>> CURRENT_TIMESTAMP as Calcite did.
>>>> 
>>>> I can image (1) which we don't want to left a bad smell in Flink SQL,
>>> and
>>>> I also accept (2) because I think users do not consider time zone issues
>>>> when they use CURRENT_DATE/CURRENT_TIME, and the timezone info in time is
>>>> not very useful.
>>>> 
>>>> I don’t have a strong opinion  for them.  What do others think?
>>>> 
>>>> 
>>>> I hope I've addressed your concerns. @Timo @Kurt
>>>> 
>>>> Best,
>>>> Leonard
>>>> 
>>>> 
>>>> 
>>>>> Most of the mature systems have a clear difference between
>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. I wouldn't take Spark or Hive as a
>>>> good example. Snowflake decided for TIMESTAMP WITH LOCAL TIME ZONE. As I
>>>> mentioned in the last comment, I could also imagine this behavior for
>>>> Flink. But in any case, there should be some time zone information
>>>> considered in order to cast to all other types.
>>>>> 
>>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL
>>>> standard, but
>>>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping
>>>> functions which
>>>>>>>> SQL standard supported and introducing a replacement which SQL
>>>> standard not
>>>>>>>> reminded.
>>>>> 
>>>>> We can still add those functions in the future. But since we don't
>>> offer
>>>> a TIME WITH TIME ZONE, it is better to not support this function at all
>>> for
>>>> now. And by the way, this is exactly the behavior that also Microsoft SQL
>>>> Server does: it also just supports CURRENT_TIMESTAMP (but it returns
>>>> TIMESTAMP without a zone which completes the confusion).
>>>>> 
>>>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE for PROCTIME
>>>> has
>>>>>>>> more clear semantics, but I realized that user didn’t care the type
>>>> but
>>>>>>>> more about the expressed value they saw, and change the type from
>>>> TIMESTAMP
>>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that we need
>>>>>>>> consider all places where the TIMESTAMP type used
>>>>> 
>>>>> From a UDF perspective, I think nothing will change. The new type
>>> system
>>>> and type inference were designed to support all these cases. There is a
>>>> reason why Java has adopted Joda time, because it is hard to come up
>>> with a
>>>> good time library. That's why also we and the other Hadoop ecosystem
>>> folks
>>>> have decided for 3 different kinds of LocalDateTime, ZonedDateTime, and
>>>> Instance. It makes the library more complex, but time is a complex topic.
>>>>> 
>>>>> I also doubt that many users work with only one time zone. Take the US
>>>> as an example, a country with 3 different timezones. Somebody working
>>> with
>>>> US data cannot properly see the data points with just LOCAL TIME ZONE.
>>> But
>>>> on the other hand, a lot of event data is stored using a UTC timestamp.
>>>>> 
>>>>> 
>>>>>>> Before jumping into technique details, let's take a step back to
>>>> discuss
>>>>>>> user experience.
>>>>>>> 
>>>>>>> The first important question is what kind of date and time will
>>> Flink
>>>>>>> display when users call
>>>>>>>   CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are
>>>> similar).
>>>>>>> 
>>>>>>> Should it always display the date and time in UTC or in the user's
>>>> time
>>>>>>> zone?
>>>>> 
>>>>> @Kurt: I think we all agree that the current behavior with just showing
>>>> UTC is wrong. Also, we all agree that when calling CURRENT_TIMESTAMP or
>>>> PROCTIME a user would like to see the time in it's current time zone.
>>>>> 
>>>>> As you said, "my wall clock time".
>>>>> 
>>>>> However, the question is what is the data type of what you "see". If
>>> you
>>>> pass this record on to a different system, operator, or different
>>> cluster,
>>>> should the "my" get lost or materialized into the record?
>>>>> 
>>>>> TIMESTAMP -> completely lost and could cause confusion in a different
>>>> system
>>>>> 
>>>>> TIMESTAMP WITH LOCAL TIME ZONE -> at least the UTC is correct, so you
>>>> can provide a new local time zone
>>>>> 
>>>>> TIMESTAMP WITH TIME ZONE -> also "your" location is persisted
>>>>> 
>>>>> Regards,
>>>>> Timo
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> On 22.01.21 09:38, Kurt Young wrote:
>>>>>> Forgot one more thing. Continue with displaying in UTC. As a user, if
>>>> Flink
>>>>>> want to display the timestamp
>>>>>> in UTC, why don't we offer something like UTC_TIMESTAMP?
>>>>>> Best,
>>>>>> Kurt
>>>>>> On Fri, Jan 22, 2021 at 4:33 PM Kurt Young <ykt...@gmail.com> wrote:
>>>>>>> Before jumping into technique details, let's take a step back to
>>>> discuss
>>>>>>> user experience.
>>>>>>> 
>>>>>>> The first important question is what kind of date and time will Flink
>>>>>>> display when users call
>>>>>>>  CURRENT_TIMESTAMP and maybe also PROCTIME (if we think they are
>>>> similar).
>>>>>>> 
>>>>>>> Should it always display the date and time in UTC or in the user's
>>> time
>>>>>>> zone? I think this part is the
>>>>>>> reason that surprised lots of users. If we forget about the type and
>>>>>>> internal representation of these
>>>>>>> two methods, as a user, my instinct tells me that these two methods
>>>> should
>>>>>>> display my wall clock time.
>>>>>>> 
>>>>>>> Display time in UTC? I'm not sure, why I should care about UTC time?
>>> I
>>>>>>> want to get my current timestamp.
>>>>>>> For those users who have never gone abroad, they might not even be
>>>> able to
>>>>>>> realize that this is affected
>>>>>>> by the time zone.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Kurt
>>>>>>> 
>>>>>>> 
>>>>>>> On Fri, Jan 22, 2021 at 12:25 PM Leonard Xu <xbjt...@gmail.com>
>>> wrote:
>>>>>>> 
>>>>>>>> Thanks @Timo for the detailed reply, let's go on this topic on this
>>>>>>>> discussion,  I've merged all mails to this discussion.
>>>>>>>> 
>>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
>>>>>>>>> 
>>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
>>>>>>>>> 
>>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
>>>>>>>>> 
>>>>>>>>> I'm very sceptical about this behavior. Almost all mature systems
>>>>>>>> (Oracle, Postgres) and new high quality systems (Presto, Snowflake)
>>>> use a
>>>>>>>> data type with some degree of time zone information encoded. In a
>>>>>>>> globalized world with businesses spanning different regions, I think
>>>> we
>>>>>>>> should do this as well. There should be a difference between
>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to
>>>> choose
>>>>>>>> which behavior they prefer for their pipeline.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I know that the two series should be different at first glance, but
>>>>>>>> different SQL engines can have their own explanations,for example,
>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP are synonyms in Snowflake[1]
>>> and
>>>> has
>>>>>>>> no difference, and Spark only supports the later one and doesn’t
>>>> support
>>>>>>>> LOCALTIME/LOCALTIMESTAMP[2].
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> If we would design this from scatch, I would suggest the following:
>>>>>>>>> 
>>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE /
>>>>>>>> LOCALTIME for materialized timestamp parts
>>>>>>>> 
>>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting in SQL
>>> standard,
>>>> but
>>>>>>>> LOCALDATE not, I don’t think it’s a good idea that dropping
>>> functions
>>>> which
>>>>>>>> SQL standard supported and introducing a replacement which SQL
>>>> standard not
>>>>>>>> reminded.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to
>>>>>>>> materialize all session time information into every record. It it
>>> the
>>>> most
>>>>>>>> generic data type and allows to cast to all other timestamp data
>>>> types.
>>>>>>>> This generic ability can be used for filter predicates as well
>>> either
>>>>>>>> through implicit or explicit casting.
>>>>>>>> 
>>>>>>>> TIMESTAMP WITH TIME ZONE indeed contains more information to
>>> describe
>>>> a
>>>>>>>> time point, but the type TIMESTAMP  can cast to all other timestamp
>>>> data
>>>>>>>> types combining with session time zone as well, and it also can be
>>>> used for
>>>>>>>> filter predicates. For type casting between BIGINT and TIMESTAMP, I
>>>> think
>>>>>>>> the function way using TO_TIMEMTAMP()/FROM_UNIXTIMESTAMP() is more
>>>> clear.
>>>>>>>> 
>>>>>>>>> PROCTIME/ROWTIME should be time functions based on a long value.
>>> Both
>>>>>>>> System.currentMillis() and our watermark system work on long values.
>>>> Those
>>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the main
>>>> calculation
>>>>>>>> should always happen based on UTC.
>>>>>>>>> We discussed it in a different thread, but we should allow PROCTIME
>>>>>>>> globally. People need a way to create instances of TIMESTAMP WITH
>>>> LOCAL
>>>>>>>> TIME ZONE. This is not considered in the current design doc.
>>>>>>>>> Many pipelines contain UTC timestamps and thus it should be easy to
>>>>>>>> create one.
>>>>>>>>> Also, both CURRENT_TIMESTAMP and LOCALTIMESTAMP can work with this
>>>> type
>>>>>>>> because we should remember that TIMESTAMP WITH LOCAL TIME ZONE
>>>> accepts all
>>>>>>>> timestamp data types as casting target [1]. We could allow TIMESTAMP
>>>> WITH
>>>>>>>> TIME ZONE in the future for ROWTIME.
>>>>>>>>> In any case, windows should simply adapt their behavior to the
>>> passed
>>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is
>>>> defined by
>>>>>>>> considering the current session time zone.
>>>>>>>> 
>>>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE for PROCTIME
>>>> has
>>>>>>>> more clear semantics, but I realized that user didn’t care the type
>>>> but
>>>>>>>> more about the expressed value they saw, and change the type from
>>>> TIMESTAMP
>>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor that we need
>>>>>>>> consider all places where the TIMESTAMP type used, and many builtin
>>>>>>>> functions and UDFs doest not support  TIMESTAMP WITH LOCAL TIME ZONE
>>>> type.
>>>>>>>> That means both user and Flink devs need to refactor the code(UDF,
>>>> builtin
>>>>>>>> functions, sql pipeline), to be honest, I didn’t see strong
>>>> motivation that
>>>>>>>> we have to do the pretty big refactor from user’s perspective and
>>>>>>>> developer’s perspective.
>>>>>>>> 
>>>>>>>> In one word, both your suggestion and my proposal can resolve almost
>>>> all
>>>>>>>> user problems,the divergence is whether we need to spend pretty
>>>> energy just
>>>>>>>> to get a bit more accurate semantics?   I think we need a tradeoff.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Leonard
>>>>>>>> [1]
>>>>>>>> 
>>>> https://trino.io/docs/current/functions/datetime.html#current_timestamp
>>> <
>>>>>>>> 
>>>> https://trino.io/docs/current/functions/datetime.html#current_timestamp>
>>>>>>>> [2] https://issues.apache.org/jira/browse/SPARK-30374 <
>>>>>>>> https://issues.apache.org/jira/browse/SPARK-30374>
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>  2021-01-22,00:53,Timo Walther <twal...@apache.org> :
>>>>>>>>> 
>>>>>>>>> Hi Leonard,
>>>>>>>>> 
>>>>>>>>> thanks for working on this topic. I agree that time handling is not
>>>>>>>> easy in Flink at the moment. We added new time data types (and some
>>>> are
>>>>>>>> still not supported which even further complicates things like
>>>> TIME(9)). We
>>>>>>>> should definitely improve this situation for users.
>>>>>>>>> 
>>>>>>>>> This is a pretty opinionated topic and it seems that the SQL
>>> standard
>>>>>>>> is not really deciding this but is at least supporting. So let me
>>>> express
>>>>>>>> my opinion for the most important functions:
>>>>>>>>> 
>>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
>>>>>>>>> 
>>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
>>>>>>>>> 
>>>>>>>>> I think those are the most obvious ones because the LOCAL indicates
>>>>>>>> that the locality should be materialized into the result and any
>>> time
>>>> zone
>>>>>>>> information (coming from session config or data) is not important
>>>>>>>> afterwards.
>>>>>>>>> 
>>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
>>>>>>>>> 
>>>>>>>>> --> uses session time zone, returns DATE/TIME/TIMESTAMP
>>>>>>>>> 
>>>>>>>>> I'm very sceptical about this behavior. Almost all mature systems
>>>>>>>> (Oracle, Postgres) and new high quality systems (Presto, Snowflake)
>>>> use a
>>>>>>>> data type with some degree of time zone information encoded. In a
>>>>>>>> globalized world with businesses spanning different regions, I think
>>>> we
>>>>>>>> should do this as well. There should be a difference between
>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should be able to
>>>> choose
>>>>>>>> which behavior they prefer for their pipeline.
>>>>>>>>> 
>>>>>>>>> If we would design this from scatch, I would suggest the following:
>>>>>>>>> 
>>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick LOCALDATE /
>>>>>>>> LOCALTIME for materialized timestamp parts
>>>>>>>>> 
>>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH TIME ZONE to
>>>>>>>> materialize all session time information into every record. It it
>>> the
>>>> most
>>>>>>>> generic data type and allows to cast to all other timestamp data
>>>> types.
>>>>>>>> This generic ability can be used for filter predicates as well
>>> either
>>>>>>>> through implicit or explicit casting.
>>>>>>>>> 
>>>>>>>>> PROCTIME/ROWTIME should be time functions based on a long value.
>>> Both
>>>>>>>> System.currentMillis() and our watermark system work on long values.
>>>> Those
>>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because the main
>>>> calculation
>>>>>>>> should always happen based on UTC. We discussed it in a different
>>>> thread,
>>>>>>>> but we should allow PROCTIME globally. People need a way to create
>>>>>>>> instances of TIMESTAMP WITH LOCAL TIME ZONE. This is not considered
>>>> in the
>>>>>>>> current design doc. Many pipelines contain UTC timestamps and thus
>>> it
>>>>>>>> should be easy to create one. Also, both CURRENT_TIMESTAMP and
>>>>>>>> LOCALTIMESTAMP can work with this type because we should remember
>>> that
>>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE accepts all timestamp data types as
>>>> casting
>>>>>>>> target [1]. We could allow TIMESTAMP WITH TIME ZONE in the future
>>> for
>>>>>>>> ROWTIME.
>>>>>>>>> 
>>>>>>>>> In any case, windows should simply adapt their behavior to the
>>> passed
>>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME ZONE a day is
>>>> defined by
>>>>>>>> considering the current session time zone.
>>>>>>>>> 
>>>>>>>>> If we would like to design this with less effort required, we could
>>>>>>>> think about returning TIMESTAMP WITH LOCAL TIME ZONE also for
>>>>>>>> CURRENT_TIMESTAMP.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I will try to involve more people into this discussion.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> Timo
>>>>>>>>> 
>>>>>>>>> [1]
>>>>>>>> 
>>>> 
>>> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
>>>>>>>> <
>>>>>>>> 
>>>> 
>>> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>  2021-01-21,22:32,Leonard Xu <xbjt...@gmail.com> :
>>>>>>>>>> Before the changes, as I am writing this reply, the local time
>>> here
>>>> is
>>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
>>>>>>>>>> And I tried these 5 functions in sql client, and got:
>>>>>>>>>> 
>>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
>>>> CURRENT_DATE,
>>>>>>>> CURRENT_TIME;
>>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
>>>>>>>>  CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
>>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
>>>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
>>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>>> After the changes, the expected behavior will change to:
>>>>>>>>>> 
>>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
>>>> CURRENT_DATE,
>>>>>>>> CURRENT_TIME;
>>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
>>>>>>>>  CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
>>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
>>>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
>>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP still
>>> be
>>>>>>>> TIMESTAMP;
>>>>>>>>> 
>>>>>>>>> To Kurt, thanks  for the intuitive case, it really clear, you’re
>>>> wright
>>>>>>>> that I want to propose to change the return value of these
>>> functions.
>>>> It’s
>>>>>>>> the most important part of the topic from user's perspective.
>>>>>>>>> 
>>>>>>>>>> I think this definitely deserves a FLIP.
>>>>>>>>> To Jark,  nice suggestion, I prepared a FLIP for this topic, and
>>> will
>>>>>>>> start the FLIP discussion soon.
>>>>>>>>> 
>>>>>>>>>>> If use the default Flink SQL,&nbsp; the window time range of the
>>>>>>>>>>> statistics is incorrect, then the statistical results will
>>>> naturally
>>>>>>>> be
>>>>>>>>>>> incorrect.
>>>>>>>>> To zhisheng, sorry to hear that this problem influenced your
>>>> production
>>>>>>>> jobs,  Could you share your SQL pattern?  we can have more inputs
>>> and
>>>> try
>>>>>>>> to resolve them.
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Leonard
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>  2021-01-21,14:19,Jark Wu <imj...@gmail.com> :
>>>>>>>>> 
>>>>>>>>> Great examples to understand the problem and the proposed changes,
>>>>>>>> @Kurt!
>>>>>>>>> 
>>>>>>>>> Thanks Leonard for investigating this problem.
>>>>>>>>> The time-zone problems around time functions and windows have
>>>> bothered a
>>>>>>>>> lot of users. It's time to fix them!
>>>>>>>>> 
>>>>>>>>> The return value changes sound reasonable to me, and keeping the
>>>> return
>>>>>>>>> type unchanged will minimize the surprise to the users.
>>>>>>>>> Besides that, I think it would be better to mention how this
>>> affects
>>>> the
>>>>>>>>> window behaviors, and the interoperability with DataStream.
>>>>>>>>> 
>>>>>>>>> I think this definitely deserves a FLIP.
>>>>>>>>> 
>>>>>>>>> ====================================================
>>>>>>>>> 
>>>>>>>>> Hi zhisheng,
>>>>>>>>> 
>>>>>>>>> Do you have examples to illustrate which case will get the wrong
>>>> window
>>>>>>>>> boundaries?
>>>>>>>>> That will help to verify whether the proposed changes can solve
>>> your
>>>>>>>>> problem.
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Jark
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 2021-01-21,12:54,zhisheng <173855...@qq.com> :
>>>>>>>>> 
>>>>>>>>> Thanks to Leonard Xu for discussing this tricky topic. At present,
>>>>>>>> there are many Flink jobs in our production environment that are
>>> used
>>>> to
>>>>>>>> count day-level reports (eg: count PV/UV ).&nbsp;
>>>>>>>>> 
>>>>>>>>> If use the default Flink SQL,&nbsp; the window time range of the
>>>>>>>> statistics is incorrect, then the statistical results will naturally
>>>> be
>>>>>>>> incorrect.&nbsp;
>>>>>>>>> 
>>>>>>>>> The user needs to deal with the time zone manually in order to
>>> solve
>>>>>>>> the problem.&nbsp;
>>>>>>>>> 
>>>>>>>>> If Flink itself can solve these time zone issues, then I think it
>>>> will
>>>>>>>> be user-friendly.
>>>>>>>>> 
>>>>>>>>> Thank you
>>>>>>>>> 
>>>>>>>>> Best!;
>>>>>>>>> zhisheng
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>>  2021-01-21,12:11,Kurt Young <ykt...@gmail.com> :
>>>>>>>>> 
>>>>>>>>> cc this to user & user-zh mailing list because this will affect
>>> lots
>>>> of
>>>>>>>> users, and also quite a lot of users
>>>>>>>>> were asking questions around this topic.
>>>>>>>>> 
>>>>>>>>> Let me try to understand this from user's perspective.
>>>>>>>>> 
>>>>>>>>> Your proposal will affect five functions, which are:
>>>>>>>>> PROCTIME()
>>>>>>>>> NOW()
>>>>>>>>> CURRENT_DATE
>>>>>>>>> CURRENT_TIME
>>>>>>>>> CURRENT_TIMESTAMP
>>>>>>>>> Before the changes, as I am writing this reply, the local time here
>>>> is
>>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
>>>>>>>>> And I tried these 5 functions in sql client, and got:
>>>>>>>>> 
>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
>>> CURRENT_DATE,
>>>>>>>> CURRENT_TIME;
>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
>>>>>>>>  CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
>>>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228 |
>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>> After the changes, the expected behavior will change to:
>>>>>>>>> 
>>>>>>>>> Flink SQL> select now(), PROCTIME(), CURRENT_TIMESTAMP,
>>> CURRENT_DATE,
>>>>>>>> CURRENT_TIME;
>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
>>>>>>>>  CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
>>>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228 |
>>>>>>>>> 
>>>>>>>> 
>>>> 
>>> +-------------------------+-------------------------+-------------------------+--------------+--------------+
>>>>>>>>> The return type of now(), proctime() and CURRENT_TIMESTAMP still be
>>>>>>>> TIMESTAMP;
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Kurt
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
> 

Reply via email to