Thanks for confirming! Honestly I support to treat timestamp field as special and restrict modification (the way DataStream API does), but I agree the new approach could be more natural to unify the semantic of SQL for both batch and stream.
Thanks again, Jungtaek Lim (HeartSaVioR) On Tue, Apr 28, 2020 at 2:24 PM Jark Wu <imj...@gmail.com> wrote: > Hi Jungtaek, > > Yes. Your understanding is correct :) > > Best, > Jark > > On Tue, 28 Apr 2020 at 11:58, Jungtaek Lim <kabhwan.opensou...@gmail.com> > wrote: > > > Thanks Kurt and Jark for the detailed explanation! Pretty much helped to > > understand about FLIP-66. > > > > That sounds as Flink won't leverage timestamp in StreamRecord (which is > > hidden and cannot modified easily) and handles the time semantic by the > > input schema for the operation, to unify the semantic between batch and > > stream. Did I understand it correctly? > > > > I'm not familiar with internal of Flink so not easy to consume the > > information in FLINK-11286, but in general I'd be supportive with > defining > > watermark as close as possible from source, as it'll be easier to reason > > about. (I basically refer to timestamp assigner instead of watermark > > assigner though.) > > > > - Jungtaek Lim > > > > On Tue, Apr 28, 2020 at 11:37 AM Jark Wu <imj...@gmail.com> wrote: > > > > > Hi Jungtaek, > > > > > > Kurt has said what I want to say. I will add some background. > > > Flink Table API & SQL only supports to define processing-time attribute > > and > > > event-time attribute (watermark) on source, not support to define a new > > one > > > in query. > > > The time attributes will pass through the query and time-based > operations > > > can only apply on the time attributes. > > > > > > The reason why Flink Table & SQL only supports to define watermark on > > > source is that this can allow us to do per-partition watermark, source > > idle > > > and simplify things. > > > There are also some discussion about "disable arbitrary watermark > > assigners > > > in the middle of a pipeline in DataStream" in this JIRA issue comments. > > > > > > Best, > > > Jark > > > > > > [1]: https://issues.apache.org/jira/browse/FLINK-11286 > > > > > > > > > On Tue, 28 Apr 2020 at 09:28, Kurt Young <ykt...@gmail.com> wrote: > > > > > > > The current behavior is later. Flink gets time attribute column from > > > source > > > > table, and tries to analyze and keep > > > > the time attribute column as much as possible, e.g. simple projection > > or > > > > filter which doesn't effect the column > > > > will keep the time attribute, window aggregate will generate its own > > time > > > > attribute if you select window_start or > > > > window_end. But you're right, sometimes framework will loose the > > > > information about time attribute column, and > > > > after that, some operations will throw exception. > > > > > > > > Best, > > > > Kurt > > > > > > > > > > > > On Tue, Apr 28, 2020 at 7:45 AM Jungtaek Lim < > > > kabhwan.opensou...@gmail.com > > > > > > > > > wrote: > > > > > > > > > Hi devs, > > > > > > > > > > I'm interesting about the new change on FLIP-66 [1], because if I > > > > > understand correctly, Flink hasn't been having event-time timestamp > > > field > > > > > (column) as a part of "normal" schema, and FLIP-66 tries to change > > it. > > > > > > > > > > That sounds as the column may be open for modification, like rename > > > > (alias) > > > > > or some other operations, or even be dropped via projection. Will > > such > > > > > operations affect event-time timestamp for the record? If you have > an > > > > idea > > > > > about how Spark Structured Streaming works with watermark then you > > > might > > > > > catch the point. > > > > > > > > > > Maybe the question could be reworded as, does the definition of > event > > > > time > > > > > timestamp column on DDL only project to the source definition, or > it > > > will > > > > > carry over the entire query and let operator determine such column > as > > > > > event-time timestamp. (SSS works as latter.) I think this is a huge > > > > > difference, as for me it's like stability vs flexibility, and > > there're > > > > > drawbacks on latter (there're also drawbacks on former as well, but > > > > > computed column may cover up). > > > > > > > > > > Thanks in advance! > > > > > Jungtaek Lim (HeartSaVioR) > > > > > > > > > > 1. > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-66%3A+Support+Time+Attribute+in+SQL+DDL > > > > > > > > > > > > > > >