+1 for the default "auto" to the "table.exec.time-function-evaluation".

>From the definition of these functions, in my opinion:
- Batch is the instant execution of all records, which is the meaning of
the word "BATCH", so there is only one time at query-start.
- Stream only executes a single record in a moment, so time is generated by
each record.

On the other hand, we should be more careful about consistency with other
systems.

Best,
Jingsong

On Tue, Feb 2, 2021 at 11:24 AM Jark Wu <imj...@gmail.com> wrote:

> Hi Leonard, Timo,
>
> I just did some investigation and found all the other batch processing
> systems
>  evaluate the time functions at query-start, including Snowflake, Hive,
> Spark, Trino.
> I'm wondering whether the default 'per-record' mode will still be weird for
> batch users.
> I know we proposed the option for batch users to change the behavior.
> However if 90% users need to set this config before submitting batch jobs,
> why not
> use this mode for batch by default? For the other 10% special users, they
> can still
> set the config to per-record before submitting batch jobs. I believe this
> can greatly
> improve the usability for batch cases.
>
> Therefore, what do you think about using "auto" as the default option
> value?
>
> It evaluates time functions per-record in streaming mode and evaluates at
> query start in batch mode.
> I think this can make both streaming users and batch users happy. IIUC, the
> reason why we
> proposing the default "per-record" mode is for the batch streaming
> consistent.
> However, I think time functions are special cases because they are
> naturally non-deterministic.
> Even if streaming jobs and batch jobs all use "per-record" mode, they still
> can't provide consistent
> results. Thus, I think we may need to think more from the users'
> perspective.
>
> Best,
> Jark
>
>
> On Mon, 1 Feb 2021 at 23:06, Timo Walther <twal...@apache.org> wrote:
>
> > Hi Leonard,
> >
> > thanks for considering this issue as well. +1 for the proposed config
> > option. Let's start a voting thread once the FLIP document has been
> > updated if there are no other concerns?
> >
> > Thanks,
> > Timo
> >
> >
> > On 01.02.21 15:07, Leonard Xu wrote:
> > > Hi, all
> > >
> > > I’ve discussed with @Timo @Jark about the time function evaluation
> > further. We reach a consensus that we’d better address the time function
> > evaluation(function value materialization) in this FLIP as well.
> > >
> > > We’re fine with introducing an option
> > table.exec.time-function-evaluation to control the materialize time point
> > of time function value. The time function includes
> > > LOCALTIME
> > > LOCALTIMESTAMP
> > > CURRENT_DATE
> > > CURRENT_TIME
> > > CURRENT_TIMESTAMP
> > > NOW()
> > > The default value of table.exec.time-function-evaluation is
> > 'per-record', which means Flink evaluates the function value per record,
> we
> > recommend users config this option value for their streaming pipe lines.
> > > Another valid option value is ’query-start’, which means Flink
> evaluates
> > the function value at the query start, we recommend users config this
> > option value for their batch pipelines.
> > > In the future, more valid evaluation option value like ‘auto' may be
> > supported if there’re new requirements, e.g: support ‘auto’ option which
> > evaluates time function value per-record in streaming mode and evaluates
> > > time function value at query start in batch mode.
> > >
> > > Alternative1:
> > >       Introduce function like CURRENT_TIMESTAMP2/CURRENT_TIMESTAMP_NOW
> > which evaluates function value at query start. This may confuse users a
> bit
> > that we provide two similar functions but with different return value.
> >
> > >
> > > Alternative2:
> > >         Do not introduce any configuration/function, control the
> > function evaluation by pipeline execution mode. This may produce
> different
> > result when user use their  streaming pipeline sql to run a batch
> > pipeline(e.g backfilling), and user also
> > > can not control these function behavior.
> > >
> > >
> > > How do you think ?
> > >
> > > Thanks,
> > > Leonard
> > >
> > >
> > >> 在 2021年2月1日,18:23,Timo Walther <twal...@apache.org> 写道:
> > >>
> > >> Parts of the FLIP can already be implemented without a completed
> > voting, e.g. there is no doubt that we should support TIME(9).
> > >>
> > >> However, I don't see a benefit of reworking the time functions to
> > rework them again later. If we lock the time on query-start the
> > implementation of the previsouly mentioned functions will be completely
> > different.
> > >>
> > >> Regards,
> > >> Timo
> > >>
> > >>
> > >> On 01.02.21 02:37, Kurt Young wrote:
> > >>> I also prefer to not expand this FLIP further, but we could open a
> > >>> discussion thread
> > >>> right after this FLIP being accepted and start coding & reviewing.
> Make
> > >>> technique
> > >>> discussion and coding more pipelined will improve efficiency.
> > >>> Best,
> > >>> Kurt
> > >>> On Sat, Jan 30, 2021 at 3:47 PM Leonard Xu <xbjt...@gmail.com>
> wrote:
> > >>>> Hi, Timo
> > >>>>
> > >>>>> I do think that this topic must be part of the FLIP as well. Esp.
> if
> > the
> > >>>> FLIP has the title "time function behavior" and this is clearly a
> > >>>> behavioral aspect. We are performing a heavy refactoring of the SQL
> > query
> > >>>> semantics in Flink here which will affect a lot of users. We cannot
> > rework
> > >>>> the time functions a third time after this.
> > >>>>> I checked a couple of other vendors. It seems that they all lock
> the
> > >>>> timestamp when the query is started. And as you said, in this case
> > both
> > >>>> mature (Oracle) and less mature systems (Hive, MySQL) have the same
> > >>>> behavior.
> > >>>>
> > >>>> FLIP-162> “These problems come from the fact that lots of
> time-related
> > >>>> functions like PROCTIME(), NOW(), CURRENT_DATE, CURRENT_TIME and
> > >>>> CURRENT_TIMESTAMP are returning time values based on UTC+0 time
> zone."
> > >>>> The motivation of  FLIP-162 is to correct the wrong time-related
> > function
> > >>>> value which caused by timezone. And after our discussed before, we
> > found
> > >>>> it's related to the function return type compared to SQL standard
> and
> > other
> > >>>> vendors and thus we proposed make the function return type also
> > consistent.
> > >>>> This is the exact meaning of the FLIP  title and that the FLIP plans
> > to do.
> > >>>>
> > >>>> But for the function materialization mechanism, we didn't consider
> > yet as
> > >>>> a part of our plan because we need to fix the timezone and function
> > type
> > >>>> issues no matter we modify the function materialization mechanism in
> > the
> > >>>> future or not.
> > >>>> So I think it's not belong to this FLIP scope.
> > >>>>
> > >>>> It will have been a great work if we can fix current FLIP's 7
> > proposals
> > >>>> well, we don't want to expand the scope again Eps it's not part of
> our
> > >>>> plan.
> > >>>>
> > >>>> What do you think? @Timo
> > >>>>
> > >>>> And what’s others' thoughts?  @Jark @Kurt
> > >>>>
> > >>>> Best,
> > >>>> Leonard
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>> Flink should not differ. I fear that we have to adopt this behavior
> > as
> > >>>> well to call us standard compliant. Otherwise it will also not be
> > possible
> > >>>> to have Hive compatibility with proper semantics. It could lead to
> > >>>> unintended behavior.
> > >>>>>
> > >>>>> I see two options for this topic:
> > >>>>>
> > >>>>> 1) Clearly distinguish between query-start and processing time
> > >>>>>
> > >>>>> MySQL offers NOW() and SYSDATE() to distinguish the two semantics.
> We
> > >>>> could run all the previously discussed functions that have a meaning
> > in
> > >>>> other systems in query-start time and use a different name for
> > processing
> > >>>> time. `SYS_TIMESTAMP`, `SYS_DATE`, `SYS_TIME`, `SYS_LOCALTIMESTAMP`,
> > >>>> `SYS_LOCALDATE`, `SYS_LOCALTIME`?
> > >>>>>
> > >>>>> 2) Introduce a config option
> > >>>>>
> > >>>>> We are non-compliant by default and allow typical batch behavior if
> > >>>> needed via a config option. But batch/stream unification should not
> > mean
> > >>>> that we disable certain unification aspects by default.
> > >>>>>
> > >>>>> What do you think?
> > >>>>>
> > >>>>> Regards,
> > >>>>> Timo
> > >>>>>
> > >>>>> On 28.01.21 16:51, Leonard Xu wrote:
> > >>>>>> Hi, Timo
> > >>>>>>> I'm sorry that I need to open another discussion thread befoe
> > voting
> > >>>> but I think we should also discuss this in this FLIP before it pops
> > up at a
> > >>>> later stage.
> > >>>>>>>
> > >>>>>>> How do we want our time functions to behave in long running
> > queries?
> > >>>>>> It’s okay to open this thread. Although I don’t want to consider
> the
> > >>>> function value materialization in this FLIP scope,  I could try
> > explain
> > >>>> something.
> > >>>>>>> See also:
> > >>>>>>>
> > >>>>
> >
> https://stackoverflow.com/questions/5522656/sql-now-in-long-running-query
> > >>>>>>>
> > >>>>>>> I think this was never discussed thoroughly. Actually
> > >>>> CURRENT_TIMESTAMP/NOW/LOCALTIMESTAMP should have slightly different
> > >>>> semantics than PROCTIME(). What it is our current behavior? Are we
> > >>>> materializing those time values during planning?
> > >>>>>> Currently CURRENT_TIMESTAMP/NOW/LOCALTIMESTAMP  keeps same
> behavior
> > in
> > >>>> both Batch and Stream world,  the function value is materialized for
> > per
> > >>>> record not the query start(plan phase).
> > >>>>>> For  PROCTIME(), it also keeps same behavior  in both Batch and
> > Stream
> > >>>> world, in fact we just supported PROCTIME() in Batch last week[1].
> > >>>>>> In one word, we keep same semantics/behavior for Batch and Stream.
> > >>>>>>> Esp. long running batch queries might suffer from inconsistencies
> > >>>> here. When a timestamp is produced by one operator using
> > CURRENT_TIMESTAMP
> > >>>> and a different one might filter relating to CURRENT_TIMESTAMP.
> > >>>>>> It’s a good question, and I've found some users have asked
> simillar
> > >>>> questions in user/user-zh mail-list,  given a fact that many Batch
> > systems
> > >>>> like Hive/Presto using the value of query start, but it’s not
> > suitable for
> > >>>> Stream engine, for example user will use CURRENT_TIMESTAMP to define
> > event
> > >>>> time.
> > >>>>>> As a unified Batch/Stream SQL engine, keep same semantics/behavior
> > is
> > >>>> important, and I agree the Batch user case should also be
> considered.
> > >>>>>> But I think this should be discussed in another topic like 'the
> > >>>> unification of Batch/Stream' which is beyond the scope of this FLIP.
> > >>>>>> This FLIP aims to correct the wrong return type/return value of
> > current
> > >>>> time functions.
> > >>>>>> Best,
> > >>>>>> Leonard
> > >>>>>> [1] https://issues.apache.org/jira/browse/FLINK-17868 <
> > >>>> https://issues.apache.org/jira/browse/FLINK-17868> <
> > >>>> https://issues.apache.org/jira/browse/FLINK-17868 <
> > >>>> https://issues.apache.org/jira/browse/FLINK-17868>>
> > >>>>>>> Regards,
> > >>>>>>> Timo
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On 28.01.21 13:46, Leonard Xu wrote:
> > >>>>>>>> Hi, Jark
> > >>>>>>>>> I have a minor suggestion:
> > >>>>>>>>> I think we will still suggest users use TIMESTAMP even if we
> have
> > >>>> TIMESTAMP_NTZ. Then it seems
> > >>>>>>>>> introducing TIMESTAMP_NTZ doesn't help much for users, but
> > >>>> introduces more learning costs.
> > >>>>>>>> I think your suggestion makes sense, we should suggest users use
> > >>>> TIMESTAMP for TIMESTAMP WITHOUT TIME ZONE as we did now, updated as
> > >>>> following:
> > >>>>>>>>     original type name :
> > >>>>                        shortcut type name :
> > >>>>>>>> TIMESTAMP / TIMESTAMP WITHOUT TIME ZONE         <=> TIMESTAMP
> > >>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE                            <=>
> > >>>> TIMESTAMP_LTZ
> > >>>>>>>> TIMESTAMP WITH TIME ZONE
> >  <=>
> > >>>> TIMESTAMP_TZ     (supports them in the future)
> > >>>>>>>> Best,
> > >>>>>>>> Leonard
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On Thu, 28 Jan 2021 at 18:52, Leonard Xu <xbjt...@gmail.com
> > <mailto:
> > >>>> xbjt...@gmail.com> <mailto:xbjt...@gmail.com <mailto:
> > xbjt...@gmail.com>>>
> > >>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Thanks all for sharing your opinions.
> > >>>>>>>>>>
> > >>>>>>>>>> Looks like  we’ve reached a consensus about the topic.
> > >>>>>>>>>>
> > >>>>>>>>>> @Timo:
> > >>>>>>>>>>> 1) Are we on the same page that LOCALTIMESTAMP returns
> > TIMESTAMP
> > >>>> and not
> > >>>>>>>>>> TIMESTAMP_LTZ? Maybe we should quickly list also
> > >>>> LOCALTIME/LOCALDATE and
> > >>>>>>>>>> LOCALTIMESTAMP for completeness.
> > >>>>>>>>>> Yes, LOCALTIMESTAMP returns TIMESTAMP, LOCALTIME returns TIME,
> > the
> > >>>>>>>>>> behavior of them is clear so I just listed them in the
> excel[1]
> > of
> > >>>> this
> > >>>>>>>>>> FLIP references.
> > >>>>>>>>>>
> > >>>>>>>>>>> 2) Shall we add aliases for the timestamp types as part of
> this
> > >>>> FLIP? I
> > >>>>>>>>>> see Snowflake supports TIMESTAMP_LTZ , TIMESTAMP_NTZ ,
> > TIMESTAMP_TZ
> > >>>> [1]. I
> > >>>>>>>>>> think the discussion was quite cumbersome with the full string
> > of
> > >>>>>>>>>> `TIMESTAMP WITH LOCAL TIME ZONE`. With this FLIP we are making
> > this
> > >>>> type
> > >>>>>>>>>> even more prominent. And important concepts should have a
> short
> > name
> > >>>>>>>>>> because they are used frequently. According to the FLIP, we
> are
> > >>>> introducing
> > >>>>>>>>>> the abbriviation already in function names like
> > `TO_TIMESTAMP_LTZ`.
> > >>>>>>>>>> `TIMESTAMP_LTZ` could be treated similar to `STRING` for
> > >>>>>>>>>> `VARCHAR(MAX_INT)`, the serializable string representation
> would
> > >>>> not change.
> > >>>>>>>>>>
> > >>>>>>>>>> @Timo @Jark
> > >>>>>>>>>> Nice idea, I also suffered from the long name during the
> > >>>> discussions, the
> > >>>>>>>>>> abbreviation will not only help us, but also makes it more
> > >>>> convenient for
> > >>>>>>>>>> users. I list the abbreviation name mapping to support:
> > >>>>>>>>>> TIMESTAMP WITHOUT TIME ZONE         <=> TIMESTAMP_NTZ   (which
> > >>>> synonyms
> > >>>>>>>>>> TIMESTAMP)
> > >>>>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE    <=> TIMESTAMP_LTZ
> > >>>>>>>>>> TIMESTAMP WITH TIME ZONE                 <=> TIMESTAMP_TZ
> > >>>>   (supports
> > >>>>>>>>>> them in the future)
> > >>>>>>>>>>> 3) I'm fine with supporting all conversion classes like
> > >>>>>>>>>> java.time.LocalDateTime, java.sql.Timestamp that TimestampType
> > >>>> supported
> > >>>>>>>>>> for LocalZonedTimestampType. But we agree that Instant stays
> the
> > >>>> default
> > >>>>>>>>>> conversion class right? The default extraction defined in [2]
> > will
> > >>>> not
> > >>>>>>>>>> change, correct?
> > >>>>>>>>>> Yes, Instant stays the default conversion class. The default
> > >>>>>>>>>>
> > >>>>>>>>>>> 4) I would remove the comment "Flink supports TIME-related
> > types
> > >>>> with
> > >>>>>>>>>> precision well", because unfortunately this is still not
> > correct.
> > >>>> We still
> > >>>>>>>>>> have issues with TIME(9), it would be great if someone can
> > finally
> > >>>> fix that
> > >>>>>>>>>> though. Maybe the implementation of this FLIP would be a good
> > time
> > >>>> to fix
> > >>>>>>>>>> this issue.
> > >>>>>>>>>> You’re right, TIME(9) is not supported yet, I'll take account
> of
> > >>>> TIME(9)
> > >>>>>>>>>> to the scope of this FLIP.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> I’ve updated this FLIP[2] according your suggestions @Jark
> @Timo
> > >>>>>>>>>> I’ll start the vote soon if there’re no objections.
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Leonard
> > >>>>>>>>>>
> > >>>>>>>>>> [1]
> > >>>>>>>>>>
> > >>>>
> >
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> > >>>>>>>>>> <
> > >>>>>>>>>>
> > >>>>
> >
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> > >>>> <
> > >>>>
> >
> https://docs.google.com/spreadsheets/d/1T178krh9xG-WbVpN7mRVJ8bzFnaSJx3l-eg1EWZe_X4/edit?usp=sharing
> > >>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>> [2]
> > >>>>>>>>>>
> > >>>>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior
> > >>>> <
> > >>>>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162%3A+Consistent+Flink+SQL+time+function+behavior
> > >>>>>
> > >>>>>>>>>> <
> > >>>>>>>>>>
> > >>>>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162:+Consistent+Flink+SQL+time+function+behavior
> > >>>> <
> > >>>>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-162:+Consistent+Flink+SQL+time+function+behavior
> > >>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 28.01.21 03:18, Jark Wu wrote:
> > >>>>>>>>>>>> Thanks Leonard for the further investigation.
> > >>>>>>>>>>>> I think we all agree we should correct the return value of
> > >>>>>>>>>>>> CURRENT_TIMESTAMP.
> > >>>>>>>>>>>> Regarding the return type of CURRENT_TIMESTAMP, I also agree
> > >>>>>>>>>> TIMESTAMP_LTZ
> > >>>>>>>>>>>> would be more worldwide useful. This may need more effort,
> > but if
> > >>>> this
> > >>>>>>>>>> is
> > >>>>>>>>>>>> the right direction, we should do it.
> > >>>>>>>>>>>> Regarding the CURRENT_TIME, if CURRENT_TIMESTAMP returns
> > >>>>>>>>>>>> TIMESTAMP_LTZ, then I think CURRENT_TIME shouldn't return
> > TIME_TZ.
> > >>>>>>>>>>>> Otherwise, CURRENT_TIME will be quite special and strange.
> > >>>>>>>>>>>> Thus I think it has to return TIME type. Given that we
> already
> > >>>> have
> > >>>>>>>>>>>> CURRENT_DATE which returns
> > >>>>>>>>>>>> DATE WITHOUT TIME ZONE, I think it's fine to return TIME
> > WITHOUT
> > >>>> TIME
> > >>>>>>>>>> ZONE
> > >>>>>>>>>>>> for CURRENT_TIME.
> > >>>>>>>>>>>> In a word, the updated FLIP looks good to me. I especially
> > like
> > >>>> the
> > >>>>>>>>>>>> proposed new function TO_TIMESTAMP_LTZ(numeric, [,scale]).
> > >>>>>>>>>>>> This will be very convenient to define rowtime on a long
> value
> > >>>> which is
> > >>>>>>>>>> a
> > >>>>>>>>>>>> very common case and has been complained a lot in mailing
> > list.
> > >>>>>>>>>>>> Best,
> > >>>>>>>>>>>> Jark
> > >>>>>>>>>>>> On Mon, 25 Jan 2021 at 21:12, Kurt Young <ykt...@gmail.com>
> > >>>> wrote:
> > >>>>>>>>>>>>> Thanks Leonard for the detailed response and also the bad
> > case
> > >>>> about
> > >>>>>>>>>> option
> > >>>>>>>>>>>>> 1, these all
> > >>>>>>>>>>>>> make sense to me.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Also nice catch about conversion support of
> > >>>> LocalZonedTimestampType, I
> > >>>>>>>>>>>>> think it actually
> > >>>>>>>>>>>>> makes sense to support java.sql.Timestamp as well as
> > >>>>>>>>>>>>> java.time.LocalDateTime. It also has
> > >>>>>>>>>>>>> a slight benefit that we might have a chance to run the udf
> > >>>> which took
> > >>>>>>>>>> them
> > >>>>>>>>>>>>> as input parameter
> > >>>>>>>>>>>>> after we change the return type.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Regarding to the return type of CURRENT_TIME, I also think
> > >>>> timezone
> > >>>>>>>>>>>>> information is not useful.
> > >>>>>>>>>>>>> To not expand this FLIP further, I'm lean to keep it as it
> > is.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>> Kurt
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Mon, Jan 25, 2021 at 8:50 PM Leonard Xu <
> > xbjt...@gmail.com>
> > >>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Hi, All
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Thanks for your comments. I think all of the thread have
> > agreed
> > >>>> that:
> > >>>>>>>>>>>>>> (1) The return values of
> > >>>>>>>>>> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME()
> > >>>>>>>>>>>>>> are wrong.
> > >>>>>>>>>>>>>> (2) The LOCALTIME/LOCALTIMESTAMP and
> > >>>> CURRENT_TIME/CURRENT_TIMESTAMP
> > >>>>>>>>>>>>> should
> > >>>>>>>>>>>>>> be different whether from SQL standard’s perspective or
> > mature
> > >>>>>>>>>> systems.
> > >>>>>>>>>>>>>> (3) The semantics of three TIMESTAMP types in Flink SQL
> > follows
> > >>>> the
> > >>>>>>>>>> SQL
> > >>>>>>>>>>>>>> standard and also keeps the same with other 'good'
> vendors.
> > >>>>>>>>>>>>>>     TIMESTAMP                                   =>  A
> > literal in
> > >>>>>>>>>>>>>> ‘yyyy-MM-dd HH:mm:ss’ format to describe a time, does not
> > >>>> contain
> > >>>>>>>>>>>>> timezone
> > >>>>>>>>>>>>>> info, can not represent an absolute time point.
> > >>>>>>>>>>>>>>     TIMESTAMP WITH LOCAL ZONE =>  Records the elapsed time
> > from
> > >>>>>>>>>> absolute
> > >>>>>>>>>>>>>> time point origin, can represent an absolute time point,
> > >>>> requires
> > >>>>>>>>>> local
> > >>>>>>>>>>>>>> time zone when expressed with ‘yyyy-MM-dd HH:mm:ss’
> format.
> > >>>>>>>>>>>>>>     TIMESTAMP WITH TIME ZONE    =>  Consists of time zone
> > info
> > >>>> and a
> > >>>>>>>>>>>>>> literal in ‘yyyy-MM-dd HH:mm:ss’ format to describe time,
> > can
> > >>>>>>>>>> represent
> > >>>>>>>>>>>>> an
> > >>>>>>>>>>>>>> absolute time point.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Currently we've two ways to correct
> > >>>>>>>>>>>>>> CURRENT_TIME/CURRENT_TIMESTAMP/NOW()/PROCTIME().
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> option (1): As the FLIP proposed, change the return value
> > from
> > >>>> UTC
> > >>>>>>>>>>>>>> timezone to local timezone.
> > >>>>>>>>>>>>>>         Pros:   (1) The change looks smaller to users and
> > >>>> developers
> > >>>>>>>>>> (2)
> > >>>>>>>>>>>>>> There're many SQL engines adopted this way
> > >>>>>>>>>>>>>>         Cons:  (1) connector devs may confuse the
> underlying
> > >>>> value of
> > >>>>>>>>>>>>>> TimestampData which needs to change according to data type
> > (2)
> > >>>> I
> > >>>>>>>>>> thought
> > >>>>>>>>>>>>>> about this weekend. Unfortunately I found a bad case:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> The proposal is fine if we only use it in FLINK SQL world,
> > but
> > >>>> we
> > >>>>>>>>>> need to
> > >>>>>>>>>>>>>> consider the conversion between Table/DataStream, assume a
> > >>>> record
> > >>>>>>>>>>>>> produced
> > >>>>>>>>>>>>>> in UTC+0 timezone with TIMESTAMP '1970-01-01 08:00:44'
> and
> > the
> > >>>> Flink
> > >>>>>>>>>> SQL
> > >>>>>>>>>>>>>> processes the data with session time zone 'UTC+8', if the
> > sql
> > >>>> program
> > >>>>>>>>>>>>> need
> > >>>>>>>>>>>>>> to convert the Table to DataStream, then we need to
> > calculate
> > >>>> the
> > >>>>>>>>>>>>> timestamp
> > >>>>>>>>>>>>>> in StreamRecord with session time zone (UTC+8), then we
> will
> > >>>> get 44 in
> > >>>>>>>>>>>>>> DataStream program, but it is wrong because the expected
> > value
> > >>>> should
> > >>>>>>>>>> be
> > >>>>>>>>>>>>> (8
> > >>>>>>>>>>>>>> * 60 * 60 + 44). The corner case tell us that the
> > >>>> ROWTIME/PROCTIME in
> > >>>>>>>>>>>>> Flink
> > >>>>>>>>>>>>>> are based on UTC+0, when correct the PROCTIME() function,
> > the
> > >>>> better
> > >>>>>>>>>> way
> > >>>>>>>>>>>>> is
> > >>>>>>>>>>>>>> to use TIMESTAMP WITH LOCAL TIME ZONE which keeps same
> long
> > >>>> value with
> > >>>>>>>>>>>>> time
> > >>>>>>>>>>>>>> based on UTC+0 and can be expressed with  local timezone.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> option (2) : As we considered in the FLIP as well as @Timo
> > >>>> suggested,
> > >>>>>>>>>>>>>> change the return type to TIMESTAMP WITH LOCAL TIME ZONE,
> > the
> > >>>>>>>>>> expressed
> > >>>>>>>>>>>>>> value depends on the local time zone.
> > >>>>>>>>>>>>>>         Pros: (1) Make Flink SQL more close to SQL
> > standard  (2)
> > >>>> Can
> > >>>>>>>>>> deal
> > >>>>>>>>>>>>>> the conversion between Table/DataStream well
> > >>>>>>>>>>>>>>         Cons: (1) We need to discuss the return value/type
> > of
> > >>>>>>>>>>>>> CURRENT_TIME
> > >>>>>>>>>>>>>> function (2) The change is bigger to users, we need to
> > support
> > >>>>>>>>>> TIMESTAMP
> > >>>>>>>>>>>>>> WITH LOCAL TIME ZONE in connectors/formats as well as
> custom
> > >>>>>>>>>> connectors.
> > >>>>>>>>>>>>>>                    (3)The TIMESTAMP WITH LOCAL TIME ZONE
> > support
> > >>>> is
> > >>>>>>>>>> weak
> > >>>>>>>>>>>>>> in Flink, thus we need some improvement,but the workload
> > does
> > >>>> not
> > >>>>>>>>>> matter
> > >>>>>>>>>>>>>> as long as we are doing the right thing ^_^
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Due to the above bad case for option (1). I think option 2
> > >>>> should be
> > >>>>>>>>>>>>>> adopted,
> > >>>>>>>>>>>>>> But we also need to consider some problems:
> > >>>>>>>>>>>>>> (1) More conversion classes like LocalDateTime,
> > sql.Timestamp
> > >>>> should
> > >>>>>>>>>> be
> > >>>>>>>>>>>>>> supported for LocalZonedTimestampType to resolve the UDF
> > >>>> compatibility
> > >>>>>>>>>>>>> issue
> > >>>>>>>>>>>>>> (2) The timezone offset for window size of one day should
> > still
> > >>>> be
> > >>>>>>>>>>>>>> considered
> > >>>>>>>>>>>>>> (3) All connectors/formats should supports TIMESTAMP WITH
> > LOCAL
> > >>>> TIME
> > >>>>>>>>>> ZONE
> > >>>>>>>>>>>>>> well and we also should record in document
> > >>>>>>>>>>>>>> I’ll update these sections of FLIP-162.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> We also need to discuss the CURRENT_TIME function. I know
> > the
> > >>>> standard
> > >>>>>>>>>>>>> way
> > >>>>>>>>>>>>>> is using TIME WITH TIME ZONE(there's no TIME WITH LOCAL
> TIME
> > >>>> ZONE),
> > >>>>>>>>>> but
> > >>>>>>>>>>>>> we
> > >>>>>>>>>>>>>> don't support this type yet and I don't see strong
> > motivation to
> > >>>>>>>>>> support
> > >>>>>>>>>>>>> it
> > >>>>>>>>>>>>>> so far.
> > >>>>>>>>>>>>>> Compared to CURRENT_TIMESTAMP, the CURRENT_TIME can not
> > >>>> represent an
> > >>>>>>>>>>>>>> absolute time point which should be considered as a string
> > >>>> consisting
> > >>>>>>>>>> of
> > >>>>>>>>>>>>> a
> > >>>>>>>>>>>>>> time with 'HH:mm:ss' format and time zone info.  We have
> > several
> > >>>>>>>>>> options
> > >>>>>>>>>>>>>> for this:
> > >>>>>>>>>>>>>> (1) We can forbid CURRENT_TIME as @Timo proposed to make
> all
> > >>>> Flink SQL
> > >>>>>>>>>>>>>> functions follow the standard well,  in this way, we need
> to
> > >>>> offer
> > >>>>>>>>>> some
> > >>>>>>>>>>>>>> guidance for user upgrading Flink versions.
> > >>>>>>>>>>>>>> (2) We can also support it from a user's perspective who
> has
> > >>>> used
> > >>>>>>>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP, btw,Snowflake
> > also
> > >>>>>>>>>> returns
> > >>>>>>>>>>>>>> TIME type.
> > >>>>>>>>>>>>>> (3) Returns TIMESTAMP WITH LOCAL TIME ZONE to make it
> equal
> > to
> > >>>>>>>>>>>>>> CURRENT_TIMESTAMP as Calcite did.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I can image (1) which we don't want to left a bad smell in
> > >>>> Flink SQL,
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>> I also accept (2) because I think users do not consider
> time
> > >>>> zone
> > >>>>>>>>>> issues
> > >>>>>>>>>>>>>> when they use CURRENT_DATE/CURRENT_TIME, and the timezone
> > info
> > >>>> in
> > >>>>>>>>>> time is
> > >>>>>>>>>>>>>> not very useful.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I don’t have a strong opinion  for them.  What do others
> > think?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> I hope I've addressed your concerns. @Timo @Kurt
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>> Leonard
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Most of the mature systems have a clear difference
> between
> > >>>>>>>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. I wouldn't take
> Spark
> > or
> > >>>> Hive
> > >>>>>>>>>> as a
> > >>>>>>>>>>>>>> good example. Snowflake decided for TIMESTAMP WITH LOCAL
> > TIME
> > >>>> ZONE.
> > >>>>>>>>>> As I
> > >>>>>>>>>>>>>> mentioned in the last comment, I could also imagine this
> > >>>> behavior for
> > >>>>>>>>>>>>>> Flink. But in any case, there should be some time zone
> > >>>> information
> > >>>>>>>>>>>>>> considered in order to cast to all other types.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting
> in
> > SQL
> > >>>>>>>>>>>>>> standard, but
> > >>>>>>>>>>>>>>>>>> LOCALDATE not, I don’t think it’s a good idea that
> > dropping
> > >>>>>>>>>>>>>> functions which
> > >>>>>>>>>>>>>>>>>> SQL standard supported and introducing a replacement
> > which
> > >>>> SQL
> > >>>>>>>>>>>>>> standard not
> > >>>>>>>>>>>>>>>>>> reminded.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> We can still add those functions in the future. But since
> > we
> > >>>> don't
> > >>>>>>>>>>>>> offer
> > >>>>>>>>>>>>>> a TIME WITH TIME ZONE, it is better to not support this
> > >>>> function at
> > >>>>>>>>>> all
> > >>>>>>>>>>>>> for
> > >>>>>>>>>>>>>> now. And by the way, this is exactly the behavior that
> also
> > >>>> Microsoft
> > >>>>>>>>>> SQL
> > >>>>>>>>>>>>>> Server does: it also just supports CURRENT_TIMESTAMP (but
> it
> > >>>> returns
> > >>>>>>>>>>>>>> TIMESTAMP without a zone which completes the confusion).
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE
> > for
> > >>>>>>>>>> PROCTIME
> > >>>>>>>>>>>>>> has
> > >>>>>>>>>>>>>>>>>> more clear semantics, but I realized that user didn’t
> > care
> > >>>> the
> > >>>>>>>>>> type
> > >>>>>>>>>>>>>> but
> > >>>>>>>>>>>>>>>>>> more about the expressed value they saw, and change
> the
> > >>>> type from
> > >>>>>>>>>>>>>> TIMESTAMP
> > >>>>>>>>>>>>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor
> > that
> > >>>> we
> > >>>>>>>>>> need
> > >>>>>>>>>>>>>>>>>> consider all places where the TIMESTAMP type used
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>  From a UDF perspective, I think nothing will change. The
> > new
> > >>>> type
> > >>>>>>>>>>>>> system
> > >>>>>>>>>>>>>> and type inference were designed to support all these
> cases.
> > >>>> There is
> > >>>>>>>>>> a
> > >>>>>>>>>>>>>> reason why Java has adopted Joda time, because it is hard
> to
> > >>>> come up
> > >>>>>>>>>>>>> with a
> > >>>>>>>>>>>>>> good time library. That's why also we and the other Hadoop
> > >>>> ecosystem
> > >>>>>>>>>>>>> folks
> > >>>>>>>>>>>>>> have decided for 3 different kinds of LocalDateTime,
> > >>>> ZonedDateTime,
> > >>>>>>>>>> and
> > >>>>>>>>>>>>>> Instance. It makes the library more complex, but time is a
> > >>>> complex
> > >>>>>>>>>> topic.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I also doubt that many users work with only one time
> zone.
> > >>>> Take the
> > >>>>>>>>>> US
> > >>>>>>>>>>>>>> as an example, a country with 3 different timezones.
> > Somebody
> > >>>> working
> > >>>>>>>>>>>>> with
> > >>>>>>>>>>>>>> US data cannot properly see the data points with just
> LOCAL
> > >>>> TIME ZONE.
> > >>>>>>>>>>>>> But
> > >>>>>>>>>>>>>> on the other hand, a lot of event data is stored using a
> UTC
> > >>>>>>>>>> timestamp.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Before jumping into technique details, let's take a
> step
> > >>>> back to
> > >>>>>>>>>>>>>> discuss
> > >>>>>>>>>>>>>>>>> user experience.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> The first important question is what kind of date and
> > time
> > >>>> will
> > >>>>>>>>>>>>> Flink
> > >>>>>>>>>>>>>>>>> display when users call
> > >>>>>>>>>>>>>>>>>   CURRENT_TIMESTAMP and maybe also PROCTIME (if we
> think
> > they
> > >>>> are
> > >>>>>>>>>>>>>> similar).
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Should it always display the date and time in UTC or in
> > the
> > >>>> user's
> > >>>>>>>>>>>>>> time
> > >>>>>>>>>>>>>>>>> zone?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> @Kurt: I think we all agree that the current behavior
> with
> > just
> > >>>>>>>>>> showing
> > >>>>>>>>>>>>>> UTC is wrong. Also, we all agree that when calling
> > >>>> CURRENT_TIMESTAMP
> > >>>>>>>>>> or
> > >>>>>>>>>>>>>> PROCTIME a user would like to see the time in it's current
> > time
> > >>>> zone.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> As you said, "my wall clock time".
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> However, the question is what is the data type of what
> you
> > >>>> "see". If
> > >>>>>>>>>>>>> you
> > >>>>>>>>>>>>>> pass this record on to a different system, operator, or
> > >>>> different
> > >>>>>>>>>>>>> cluster,
> > >>>>>>>>>>>>>> should the "my" get lost or materialized into the record?
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> TIMESTAMP -> completely lost and could cause confusion
> in a
> > >>>> different
> > >>>>>>>>>>>>>> system
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE -> at least the UTC is
> > correct,
> > >>>> so you
> > >>>>>>>>>>>>>> can provide a new local time zone
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> TIMESTAMP WITH TIME ZONE -> also "your" location is
> > persisted
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Regards,
> > >>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On 22.01.21 09:38, Kurt Young wrote:
> > >>>>>>>>>>>>>>>> Forgot one more thing. Continue with displaying in UTC.
> > As a
> > >>>> user,
> > >>>>>>>>>> if
> > >>>>>>>>>>>>>> Flink
> > >>>>>>>>>>>>>>>> want to display the timestamp
> > >>>>>>>>>>>>>>>> in UTC, why don't we offer something like UTC_TIMESTAMP?
> > >>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>> Kurt
> > >>>>>>>>>>>>>>>> On Fri, Jan 22, 2021 at 4:33 PM Kurt Young <
> > ykt...@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>> Before jumping into technique details, let's take a
> step
> > >>>> back to
> > >>>>>>>>>>>>>> discuss
> > >>>>>>>>>>>>>>>>> user experience.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> The first important question is what kind of date and
> > time
> > >>>> will
> > >>>>>>>>>> Flink
> > >>>>>>>>>>>>>>>>> display when users call
> > >>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP and maybe also PROCTIME (if we think
> > they
> > >>>> are
> > >>>>>>>>>>>>>> similar).
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Should it always display the date and time in UTC or in
> > the
> > >>>> user's
> > >>>>>>>>>>>>> time
> > >>>>>>>>>>>>>>>>> zone? I think this part is the
> > >>>>>>>>>>>>>>>>> reason that surprised lots of users. If we forget about
> > the
> > >>>> type
> > >>>>>>>>>> and
> > >>>>>>>>>>>>>>>>> internal representation of these
> > >>>>>>>>>>>>>>>>> two methods, as a user, my instinct tells me that these
> > two
> > >>>> methods
> > >>>>>>>>>>>>>> should
> > >>>>>>>>>>>>>>>>> display my wall clock time.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Display time in UTC? I'm not sure, why I should care
> > about
> > >>>> UTC
> > >>>>>>>>>> time?
> > >>>>>>>>>>>>> I
> > >>>>>>>>>>>>>>>>> want to get my current timestamp.
> > >>>>>>>>>>>>>>>>> For those users who have never gone abroad, they might
> > not
> > >>>> even be
> > >>>>>>>>>>>>>> able to
> > >>>>>>>>>>>>>>>>> realize that this is affected
> > >>>>>>>>>>>>>>>>> by the time zone.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>> Kurt
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Fri, Jan 22, 2021 at 12:25 PM Leonard Xu <
> > >>>> xbjt...@gmail.com>
> > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Thanks @Timo for the detailed reply, let's go on this
> > topic
> > >>>> on
> > >>>>>>>>>> this
> > >>>>>>>>>>>>>>>>>> discussion,  I've merged all mails to this discussion.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> --> uses session time zone, returns
> DATE/TIME/TIMESTAMP
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> --> uses session time zone, returns
> DATE/TIME/TIMESTAMP
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I'm very sceptical about this behavior. Almost all
> > mature
> > >>>> systems
> > >>>>>>>>>>>>>>>>>> (Oracle, Postgres) and new high quality systems
> (Presto,
> > >>>>>>>>>> Snowflake)
> > >>>>>>>>>>>>>> use a
> > >>>>>>>>>>>>>>>>>> data type with some degree of time zone information
> > >>>> encoded. In a
> > >>>>>>>>>>>>>>>>>> globalized world with businesses spanning different
> > >>>> regions, I
> > >>>>>>>>>> think
> > >>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>> should do this as well. There should be a difference
> > between
> > >>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should
> > be
> > >>>> able to
> > >>>>>>>>>>>>>> choose
> > >>>>>>>>>>>>>>>>>> which behavior they prefer for their pipeline.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> I know that the two series should be different at
> first
> > >>>> glance,
> > >>>>>>>>>> but
> > >>>>>>>>>>>>>>>>>> different SQL engines can have their own
> > explanations,for
> > >>>> example,
> > >>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP are synonyms in
> > >>>> Snowflake[1]
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>> has
> > >>>>>>>>>>>>>>>>>> no difference, and Spark only supports the later one
> and
> > >>>> doesn’t
> > >>>>>>>>>>>>>> support
> > >>>>>>>>>>>>>>>>>> LOCALTIME/LOCALTIMESTAMP[2].
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> If we would design this from scatch, I would suggest
> > the
> > >>>>>>>>>> following:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick
> > >>>> LOCALDATE /
> > >>>>>>>>>>>>>>>>>> LOCALTIME for materialized timestamp parts
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> The function CURRENT_DATE/CURRENT_TIME is supporting
> in
> > SQL
> > >>>>>>>>>>>>> standard,
> > >>>>>>>>>>>>>> but
> > >>>>>>>>>>>>>>>>>> LOCALDATE not, I don’t think it’s a good idea that
> > dropping
> > >>>>>>>>>>>>> functions
> > >>>>>>>>>>>>>> which
> > >>>>>>>>>>>>>>>>>> SQL standard supported and introducing a replacement
> > which
> > >>>> SQL
> > >>>>>>>>>>>>>> standard not
> > >>>>>>>>>>>>>>>>>> reminded.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH
> TIME
> > >>>> ZONE to
> > >>>>>>>>>>>>>>>>>> materialize all session time information into every
> > record.
> > >>>> It it
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>> most
> > >>>>>>>>>>>>>>>>>> generic data type and allows to cast to all other
> > timestamp
> > >>>> data
> > >>>>>>>>>>>>>> types.
> > >>>>>>>>>>>>>>>>>> This generic ability can be used for filter predicates
> > as
> > >>>> well
> > >>>>>>>>>>>>> either
> > >>>>>>>>>>>>>>>>>> through implicit or explicit casting.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> TIMESTAMP WITH TIME ZONE indeed contains more
> > information to
> > >>>>>>>>>>>>> describe
> > >>>>>>>>>>>>>> a
> > >>>>>>>>>>>>>>>>>> time point, but the type TIMESTAMP  can cast to all
> > other
> > >>>>>>>>>> timestamp
> > >>>>>>>>>>>>>> data
> > >>>>>>>>>>>>>>>>>> types combining with session time zone as well, and it
> > also
> > >>>> can be
> > >>>>>>>>>>>>>> used for
> > >>>>>>>>>>>>>>>>>> filter predicates. For type casting between BIGINT and
> > >>>> TIMESTAMP,
> > >>>>>>>>>> I
> > >>>>>>>>>>>>>> think
> > >>>>>>>>>>>>>>>>>> the function way using
> > TO_TIMEMTAMP()/FROM_UNIXTIMESTAMP()
> > >>>> is more
> > >>>>>>>>>>>>>> clear.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> PROCTIME/ROWTIME should be time functions based on a
> > long
> > >>>> value.
> > >>>>>>>>>>>>> Both
> > >>>>>>>>>>>>>>>>>> System.currentMillis() and our watermark system work
> on
> > long
> > >>>>>>>>>> values.
> > >>>>>>>>>>>>>> Those
> > >>>>>>>>>>>>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because
> the
> > >>>> main
> > >>>>>>>>>>>>>> calculation
> > >>>>>>>>>>>>>>>>>> should always happen based on UTC.
> > >>>>>>>>>>>>>>>>>>> We discussed it in a different thread, but we should
> > allow
> > >>>>>>>>>> PROCTIME
> > >>>>>>>>>>>>>>>>>> globally. People need a way to create instances of
> > >>>> TIMESTAMP WITH
> > >>>>>>>>>>>>>> LOCAL
> > >>>>>>>>>>>>>>>>>> TIME ZONE. This is not considered in the current
> design
> > doc.
> > >>>>>>>>>>>>>>>>>>> Many pipelines contain UTC timestamps and thus it
> > should
> > >>>> be easy
> > >>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>> create one.
> > >>>>>>>>>>>>>>>>>>> Also, both CURRENT_TIMESTAMP and LOCALTIMESTAMP can
> > work
> > >>>> with
> > >>>>>>>>>> this
> > >>>>>>>>>>>>>> type
> > >>>>>>>>>>>>>>>>>> because we should remember that TIMESTAMP WITH LOCAL
> > TIME
> > >>>> ZONE
> > >>>>>>>>>>>>>> accepts all
> > >>>>>>>>>>>>>>>>>> timestamp data types as casting target [1]. We could
> > allow
> > >>>>>>>>>> TIMESTAMP
> > >>>>>>>>>>>>>> WITH
> > >>>>>>>>>>>>>>>>>> TIME ZONE in the future for ROWTIME.
> > >>>>>>>>>>>>>>>>>>> In any case, windows should simply adapt their
> > behavior to
> > >>>> the
> > >>>>>>>>>>>>> passed
> > >>>>>>>>>>>>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME
> ZONE
> > a
> > >>>> day is
> > >>>>>>>>>>>>>> defined by
> > >>>>>>>>>>>>>>>>>> considering the current session time zone.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> I also agree returning  TIMESTAMP WITH LOCAL TIME ZONE
> > for
> > >>>>>>>>>> PROCTIME
> > >>>>>>>>>>>>>> has
> > >>>>>>>>>>>>>>>>>> more clear semantics, but I realized that user didn’t
> > care
> > >>>> the
> > >>>>>>>>>> type
> > >>>>>>>>>>>>>> but
> > >>>>>>>>>>>>>>>>>> more about the expressed value they saw, and change
> the
> > >>>> type from
> > >>>>>>>>>>>>>> TIMESTAMP
> > >>>>>>>>>>>>>>>>>> to TIMESTAMP WITH LOCAL TIME ZONE brings huge refactor
> > that
> > >>>> we
> > >>>>>>>>>> need
> > >>>>>>>>>>>>>>>>>> consider all places where the TIMESTAMP type used, and
> > many
> > >>>>>>>>>> builtin
> > >>>>>>>>>>>>>>>>>> functions and UDFs doest not support  TIMESTAMP WITH
> > LOCAL
> > >>>> TIME
> > >>>>>>>>>> ZONE
> > >>>>>>>>>>>>>> type.
> > >>>>>>>>>>>>>>>>>> That means both user and Flink devs need to refactor
> the
> > >>>> code(UDF,
> > >>>>>>>>>>>>>> builtin
> > >>>>>>>>>>>>>>>>>> functions, sql pipeline), to be honest, I didn’t see
> > strong
> > >>>>>>>>>>>>>> motivation that
> > >>>>>>>>>>>>>>>>>> we have to do the pretty big refactor from user’s
> > >>>> perspective and
> > >>>>>>>>>>>>>>>>>> developer’s perspective.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> In one word, both your suggestion and my proposal can
> > >>>> resolve
> > >>>>>>>>>> almost
> > >>>>>>>>>>>>>> all
> > >>>>>>>>>>>>>>>>>> user problems,the divergence is whether we need to
> spend
> > >>>> pretty
> > >>>>>>>>>>>>>> energy just
> > >>>>>>>>>>>>>>>>>> to get a bit more accurate semantics?   I think we
> need
> > a
> > >>>>>>>>>> tradeoff.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>> Leonard
> > >>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> > https://trino.io/docs/current/functions/datetime.html#current_timestamp
> > >>>>>>>>>>>>> <
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> > https://trino.io/docs/current/functions/datetime.html#current_timestamp>
> > >>>>>>>>>>>>>>>>>> [2] https://issues.apache.org/jira/browse/SPARK-30374
> <
> > >>>>>>>>>>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-30374>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> 2021-01-22,00:53,Timo Walther <twal...@apache.org> :
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Hi Leonard,
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> thanks for working on this topic. I agree that time
> > >>>> handling is
> > >>>>>>>>>> not
> > >>>>>>>>>>>>>>>>>> easy in Flink at the moment. We added new time data
> > types
> > >>>> (and
> > >>>>>>>>>> some
> > >>>>>>>>>>>>>> are
> > >>>>>>>>>>>>>>>>>> still not supported which even further complicates
> > things
> > >>>> like
> > >>>>>>>>>>>>>> TIME(9)). We
> > >>>>>>>>>>>>>>>>>> should definitely improve this situation for users.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> This is a pretty opinionated topic and it seems that
> > the
> > >>>> SQL
> > >>>>>>>>>>>>> standard
> > >>>>>>>>>>>>>>>>>> is not really deciding this but is at least
> supporting.
> > So
> > >>>> let me
> > >>>>>>>>>>>>>> express
> > >>>>>>>>>>>>>>>>>> my opinion for the most important functions:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> LOCALDATE / LOCALTIME / LOCALTIMESTAMP
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> --> uses session time zone, returns
> DATE/TIME/TIMESTAMP
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I think those are the most obvious ones because the
> > LOCAL
> > >>>>>>>>>> indicates
> > >>>>>>>>>>>>>>>>>> that the locality should be materialized into the
> result
> > >>>> and any
> > >>>>>>>>>>>>> time
> > >>>>>>>>>>>>>> zone
> > >>>>>>>>>>>>>>>>>> information (coming from session config or data) is
> not
> > >>>> important
> > >>>>>>>>>>>>>>>>>> afterwards.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> CURRENT_DATE/CURRENT_TIME/CURRENT_TIMESTAMP
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> --> uses session time zone, returns
> DATE/TIME/TIMESTAMP
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I'm very sceptical about this behavior. Almost all
> > mature
> > >>>> systems
> > >>>>>>>>>>>>>>>>>> (Oracle, Postgres) and new high quality systems
> (Presto,
> > >>>>>>>>>> Snowflake)
> > >>>>>>>>>>>>>> use a
> > >>>>>>>>>>>>>>>>>> data type with some degree of time zone information
> > >>>> encoded. In a
> > >>>>>>>>>>>>>>>>>> globalized world with businesses spanning different
> > >>>> regions, I
> > >>>>>>>>>> think
> > >>>>>>>>>>>>>> we
> > >>>>>>>>>>>>>>>>>> should do this as well. There should be a difference
> > between
> > >>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP and LOCALTIMESTAMP. And users should
> > be
> > >>>> able to
> > >>>>>>>>>>>>>> choose
> > >>>>>>>>>>>>>>>>>> which behavior they prefer for their pipeline.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> If we would design this from scatch, I would suggest
> > the
> > >>>>>>>>>> following:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> - drop CURRENT_DATE / CURRENT_TIME and let users pick
> > >>>> LOCALDATE /
> > >>>>>>>>>>>>>>>>>> LOCALTIME for materialized timestamp parts
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> - CURRENT_TIMESTAMP should return a TIMESTAMP WITH
> TIME
> > >>>> ZONE to
> > >>>>>>>>>>>>>>>>>> materialize all session time information into every
> > record.
> > >>>> It it
> > >>>>>>>>>>>>> the
> > >>>>>>>>>>>>>> most
> > >>>>>>>>>>>>>>>>>> generic data type and allows to cast to all other
> > timestamp
> > >>>> data
> > >>>>>>>>>>>>>> types.
> > >>>>>>>>>>>>>>>>>> This generic ability can be used for filter predicates
> > as
> > >>>> well
> > >>>>>>>>>>>>> either
> > >>>>>>>>>>>>>>>>>> through implicit or explicit casting.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> PROCTIME/ROWTIME should be time functions based on a
> > long
> > >>>> value.
> > >>>>>>>>>>>>> Both
> > >>>>>>>>>>>>>>>>>> System.currentMillis() and our watermark system work
> on
> > long
> > >>>>>>>>>> values.
> > >>>>>>>>>>>>>> Those
> > >>>>>>>>>>>>>>>>>> should return TIMESTAMP WITH LOCAL TIME ZONE because
> the
> > >>>> main
> > >>>>>>>>>>>>>> calculation
> > >>>>>>>>>>>>>>>>>> should always happen based on UTC. We discussed it in
> a
> > >>>> different
> > >>>>>>>>>>>>>> thread,
> > >>>>>>>>>>>>>>>>>> but we should allow PROCTIME globally. People need a
> > way to
> > >>>> create
> > >>>>>>>>>>>>>>>>>> instances of TIMESTAMP WITH LOCAL TIME ZONE. This is
> not
> > >>>>>>>>>> considered
> > >>>>>>>>>>>>>> in the
> > >>>>>>>>>>>>>>>>>> current design doc. Many pipelines contain UTC
> > timestamps
> > >>>> and thus
> > >>>>>>>>>>>>> it
> > >>>>>>>>>>>>>>>>>> should be easy to create one. Also, both
> > CURRENT_TIMESTAMP
> > >>>> and
> > >>>>>>>>>>>>>>>>>> LOCALTIMESTAMP can work with this type because we
> should
> > >>>> remember
> > >>>>>>>>>>>>> that
> > >>>>>>>>>>>>>>>>>> TIMESTAMP WITH LOCAL TIME ZONE accepts all timestamp
> > data
> > >>>> types as
> > >>>>>>>>>>>>>> casting
> > >>>>>>>>>>>>>>>>>> target [1]. We could allow TIMESTAMP WITH TIME ZONE in
> > the
> > >>>> future
> > >>>>>>>>>>>>> for
> > >>>>>>>>>>>>>>>>>> ROWTIME.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> In any case, windows should simply adapt their
> > behavior to
> > >>>> the
> > >>>>>>>>>>>>> passed
> > >>>>>>>>>>>>>>>>>> timestamp type. And with TIMESTAMP WITH LOCAL TIME
> ZONE
> > a
> > >>>> day is
> > >>>>>>>>>>>>>> defined by
> > >>>>>>>>>>>>>>>>>> considering the current session time zone.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> If we would like to design this with less effort
> > required,
> > >>>> we
> > >>>>>>>>>> could
> > >>>>>>>>>>>>>>>>>> think about returning TIMESTAMP WITH LOCAL TIME ZONE
> > also
> > >>>> for
> > >>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I will try to involve more people into this
> discussion.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thanks,
> > >>>>>>>>>>>>>>>>>>> Timo
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
> > >>>>>>>>>>>>>>>>>> <
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> https://docs.oracle.com/en/database/oracle/oracle-database/21/sqlrf/Data-Types.html#GUID-E7CA339A-2093-4FE4-A36E-1D09593591D3
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> 2021-01-21,22:32,Leonard Xu <xbjt...@gmail.com> :
> > >>>>>>>>>>>>>>>>>>>> Before the changes, as I am writing this reply, the
> > local
> > >>>> time
> > >>>>>>>>>>>>> here
> > >>>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> > >>>>>>>>>>>>>>>>>>>> And I tried these 5 functions in sql client, and
> got:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Flink SQL> select now(), PROCTIME(),
> > CURRENT_TIMESTAMP,
> > >>>>>>>>>>>>>> CURRENT_DATE,
> > >>>>>>>>>>>>>>>>>> CURRENT_TIME;
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>>> |                  EXPR$0 |                  EXPR$1
> |
> > >>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228
> |
> > >>>>>>>>>>>>>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228
> |
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>>> After the changes, the expected behavior will change
> > to:
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Flink SQL> select now(), PROCTIME(),
> > CURRENT_TIMESTAMP,
> > >>>>>>>>>>>>>> CURRENT_DATE,
> > >>>>>>>>>>>>>>>>>> CURRENT_TIME;
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>>> |                  EXPR$0 |                  EXPR$1
> |
> > >>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228
> |
> > >>>>>>>>>>>>>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228
> |
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>>> The return type of now(), proctime() and
> > >>>> CURRENT_TIMESTAMP still
> > >>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>> TIMESTAMP;
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> To Kurt, thanks  for the intuitive case, it really
> > clear,
> > >>>> you’re
> > >>>>>>>>>>>>>> wright
> > >>>>>>>>>>>>>>>>>> that I want to propose to change the return value of
> > these
> > >>>>>>>>>>>>> functions.
> > >>>>>>>>>>>>>> It’s
> > >>>>>>>>>>>>>>>>>> the most important part of the topic from user's
> > >>>> perspective.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I think this definitely deserves a FLIP.
> > >>>>>>>>>>>>>>>>>>> To Jark,  nice suggestion, I prepared a FLIP for this
> > >>>> topic, and
> > >>>>>>>>>>>>> will
> > >>>>>>>>>>>>>>>>>> start the FLIP discussion soon.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>>> If use the default Flink SQL,&nbsp; the window time
> > >>>> range of
> > >>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>>>> statistics is incorrect, then the statistical
> results
> > >>>> will
> > >>>>>>>>>>>>>> naturally
> > >>>>>>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>>>>> incorrect.
> > >>>>>>>>>>>>>>>>>>> To zhisheng, sorry to hear that this problem
> influenced
> > >>>> your
> > >>>>>>>>>>>>>> production
> > >>>>>>>>>>>>>>>>>> jobs,  Could you share your SQL pattern?  we can have
> > more
> > >>>> inputs
> > >>>>>>>>>>>>> and
> > >>>>>>>>>>>>>> try
> > >>>>>>>>>>>>>>>>>> to resolve them.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>> Leonard
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> 2021-01-21,14:19,Jark Wu <imj...@gmail.com> :
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Great examples to understand the problem and the
> > proposed
> > >>>>>>>>>> changes,
> > >>>>>>>>>>>>>>>>>> @Kurt!
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thanks Leonard for investigating this problem.
> > >>>>>>>>>>>>>>>>>>> The time-zone problems around time functions and
> > windows
> > >>>> have
> > >>>>>>>>>>>>>> bothered a
> > >>>>>>>>>>>>>>>>>>> lot of users. It's time to fix them!
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> The return value changes sound reasonable to me, and
> > >>>> keeping the
> > >>>>>>>>>>>>>> return
> > >>>>>>>>>>>>>>>>>>> type unchanged will minimize the surprise to the
> users.
> > >>>>>>>>>>>>>>>>>>> Besides that, I think it would be better to mention
> how
> > >>>> this
> > >>>>>>>>>>>>> affects
> > >>>>>>>>>>>>>> the
> > >>>>>>>>>>>>>>>>>>> window behaviors, and the interoperability with
> > DataStream.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> I think this definitely deserves a FLIP.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> ====================================================
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Hi zhisheng,
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Do you have examples to illustrate which case will
> get
> > the
> > >>>> wrong
> > >>>>>>>>>>>>>> window
> > >>>>>>>>>>>>>>>>>>> boundaries?
> > >>>>>>>>>>>>>>>>>>> That will help to verify whether the proposed changes
> > can
> > >>>> solve
> > >>>>>>>>>>>>> your
> > >>>>>>>>>>>>>>>>>>> problem.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>> Jark
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> 2021-01-21,12:54,zhisheng <173855...@qq.com> :
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thanks to Leonard Xu for discussing this tricky
> topic.
> > At
> > >>>>>>>>>> present,
> > >>>>>>>>>>>>>>>>>> there are many Flink jobs in our production
> environment
> > >>>> that are
> > >>>>>>>>>>>>> used
> > >>>>>>>>>>>>>> to
> > >>>>>>>>>>>>>>>>>> count day-level reports (eg: count PV/UV ).&nbsp;
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> If use the default Flink SQL,&nbsp; the window time
> > range
> > >>>> of the
> > >>>>>>>>>>>>>>>>>> statistics is incorrect, then the statistical results
> > will
> > >>>>>>>>>> naturally
> > >>>>>>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>> incorrect.&nbsp;
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> The user needs to deal with the time zone manually in
> > >>>> order to
> > >>>>>>>>>>>>> solve
> > >>>>>>>>>>>>>>>>>> the problem.&nbsp;
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> If Flink itself can solve these time zone issues,
> then
> > I
> > >>>> think it
> > >>>>>>>>>>>>>> will
> > >>>>>>>>>>>>>>>>>> be user-friendly.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thank you
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Best!;
> > >>>>>>>>>>>>>>>>>>> zhisheng
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> 2021-01-21,12:11,Kurt Young <ykt...@gmail.com> :
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> cc this to user & user-zh mailing list because this
> > will
> > >>>> affect
> > >>>>>>>>>>>>> lots
> > >>>>>>>>>>>>>> of
> > >>>>>>>>>>>>>>>>>> users, and also quite a lot of users
> > >>>>>>>>>>>>>>>>>>> were asking questions around this topic.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Let me try to understand this from user's
> perspective.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Your proposal will affect five functions, which are:
> > >>>>>>>>>>>>>>>>>>> PROCTIME()
> > >>>>>>>>>>>>>>>>>>> NOW()
> > >>>>>>>>>>>>>>>>>>> CURRENT_DATE
> > >>>>>>>>>>>>>>>>>>> CURRENT_TIME
> > >>>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP
> > >>>>>>>>>>>>>>>>>>> Before the changes, as I am writing this reply, the
> > local
> > >>>> time
> > >>>>>>>>>> here
> > >>>>>>>>>>>>>> is
> > >>>>>>>>>>>>>>>>>> 2021-01-21 12:03:35 (Beijing time, UTC+8).
> > >>>>>>>>>>>>>>>>>>> And I tried these 5 functions in sql client, and got:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Flink SQL> select now(), PROCTIME(),
> CURRENT_TIMESTAMP,
> > >>>>>>>>>>>>> CURRENT_DATE,
> > >>>>>>>>>>>>>>>>>> CURRENT_TIME;
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> > >>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>> | 2021-01-21T04:03:35.228 | 2021-01-21T04:03:35.228 |
> > >>>>>>>>>>>>>>>>>> 2021-01-21T04:03:35.228 |   2021-01-21 | 04:03:35.228
> |
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>> After the changes, the expected behavior will change
> > to:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Flink SQL> select now(), PROCTIME(),
> CURRENT_TIMESTAMP,
> > >>>>>>>>>>>>> CURRENT_DATE,
> > >>>>>>>>>>>>>>>>>> CURRENT_TIME;
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>> |                  EXPR$0 |                  EXPR$1 |
> > >>>>>>>>>>>>>>>>>> CURRENT_TIMESTAMP | CURRENT_DATE | CURRENT_TIME |
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>> | 2021-01-21T12:03:35.228 | 2021-01-21T12:03:35.228 |
> > >>>>>>>>>>>>>>>>>> 2021-01-21T12:03:35.228 |   2021-01-21 | 12:03:35.228
> |
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>
> >
> +-------------------------+-------------------------+-------------------------+--------------+--------------+
> > >>>>>>>>>>>>>>>>>>> The return type of now(), proctime() and
> > CURRENT_TIMESTAMP
> > >>>> still
> > >>>>>>>>>> be
> > >>>>>>>>>>>>>>>>>> TIMESTAMP;
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>> Kurt
> > >>>>
> > >>>>
> > >>
> > >
> > >
> >
> >
>


-- 
Best, Jingsong Lee

Reply via email to