Hi Timo,

Thanks for your advice. I totally agree with your suggestion of naming
convention, I will rename these options and update the flip later, thanks
very much.

In our internal implementation we had put these options inside the
`FactoryUtil`, just as you expect.  We have also taken into account the
changes to the CompiledPlan and we have packaged these options
appropriately to minimize intrusiveness and ensure the compatibility to the
`WatermarkPushDownSpec`.

> A hint to the implementation: I would suggest that we add those options
> to `FactoryUtil`. All cross-connector options should end up there.


> Please also consider the changes to the CompiledPlan in your FLIP. This
> change has implications on the JSON format as watermark strategy of
> ExecNode becomes more complex, see WatermarkPushDownSpec

Best
Kui Yuan

Timo Walther <twal...@apache.org> 于2023年2月27日周一 18:05写道:

> Hi Kui Yuan,
>
> thanks for working on this FLIP. Let me also give some comments about
> the proposed changes.
>
> I support the direction of this FLIP about handling these
> watermark-specific properties through options and /*+OPTIONS(...) */ hints.
>
> Regarding naming, I would like to keep the options in sync with existing
> options:
>
>  > 'watermark.emit.strategy'='ON_EVENT'
>
> Let's use lower case (e.g. `on-event`) that matches with properties like
> sink.partitioner [1] or sink.delivery-guarantee [2].
>
>  > 'source.idle-timeout'='1min'
>
> According to FLIP-122 [3], we want to prefix all scan-source related
> properties with `scan.*`. This clearly includes idle-timeout and
> actually also watermark strategies which don't apply for lookup sources.
>
> Summarizing the comments above, we should use the following options:
>
> 'scan.watermark.emit.strategy'='on-event',
> 'scan.watermark.emit.on-event.gap'='10000',
> 'scan.idle-timeout'='1min',
> 'scan.watermark.alignment.group'='alignment-group-1',
> 'scan.watermark.alignment.max-drift'='1min',
> 'scan.watermark.alignment.update-interval'='1s'
>
> I know that this makes the keys even longer, but given that those
> options are for power users this should be acceptable. It also clearly
> indicates which options are for sinks, scans, and lookups. This
> potentially also helps in allow lists.
>
> A hint to the implementation: I would suggest that we add those options
> to `FactoryUtil`. All cross-connector options should end up there.
>
> Please also consider the changes to the CompiledPlan in your FLIP. This
> change has implications on the JSON format as watermark strategy of
> ExecNode becomes more complex, see WatermarkPushDownSpec [4].
>
> Regards,
> Timo
>
>
> [1]
>
> https://nightlies.apache.org/flink/flink-docs-release-1.11/dev/table/connectors/kafka.html#sink-partitioner
> [2]
>
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/table/kafka/#sink-delivery-guarantee
> [3]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory
> [4]
>
> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/java/org/apache/flink/table/planner/plan/abilities/source/WatermarkPushDownSpec.java
>
>
> On 24.02.23 04:55, kui yuan wrote:
> > Hi all,
> >
> > I have updated the flip according to the discussion, and we will extend
> the
> > watermark-related features with both table options and 'OPTIONS' hint,
> like
> > this:
> >
> > ```
> > -- configure in table options
> > CREATE TABLE user_actions (
> >    ...
> >    user_action_time TIMESTAMP(3),
> >    WATERMARK FOR user_action_time AS user_action_time - INTERVAL '5'
> SECOND
> > ) WITH (
> >    'watermark.emit.strategy'='ON_PERIODIC',
> >    ...
> > );
> >
> > -- use 'OPTIONS' hint
> > select ... from source_table /*+ OPTIONS('watermark.emit.strategy'=
> > 'ON_PERIODIC') */
> > ```
> >
> > Does everybody have any other questions?
> >
> > Best
> > Kui Yuan
> >
> > kui yuan <catye...@gmail.com> 于2023年2月23日周四 20:05写道:
> >
> >> Hi all,
> >>
> >> Thanks for all suggestions.
> >>
> >> We will extend the watermark-related features in SQL layer with dynamic
> >> table options and 'OPTIONS' hint, just as everyone expects. I will
> modify
> >> Flip-296 as discussed.
> >>
> >> @Martijn As far as I know, there is no hint interface in the table API,
> >> so we can't use hint in table API directly. if we need to extend the
> hint
> >> interface in the table API, maybe we need another flip. However, if we
> >> extend the watermark-related features in the dynamic table options,
> maybe
> >> we are able to use them indirectly in the table API like this[1]:
> >>
> >> ```
> >> // register a table named "Orders"
> >> tableEnv.executeSql("CREATE TABLE Orders (`user` BIGINT, product STRING,
> >> amount INT) WITH ('watermark.emit.strategy'='ON_EVENT'...)");
> >> ```
> >>
> >> [1]
> >>
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/
> >>
> >> Best
> >> Kui Yuan
> >>
> >> Yun Tang <myas...@live.com> 于2023年2月23日周四 17:46写道:
> >>
> >>> Thanks for the warm discussions!
> >>>
> >>> I had an offline discussion with Kui about the replies. I think I could
> >>> give some explanations on the original intention to introduce another
> >>> WATERMARK_PARAMS. If we take a look at the current datastream API, the
> >>> watermark strategy does not belong to any specific connector. And we
> >>> thought the dynamic table options were more like the configurations
> within
> >>> some specific connector.
> >>>
> >>>  From the review comments, I think most people feel good to make it
> part
> >>> of the dynamic table options. I think this is fine if we give more
> clear
> >>> scope definition of the dynamic table options here. And I also agree
> with
> >>> Jingsong's concern about adding SQL syntax which is the most concerning
> >>> part before launching this discussion.
> >>>
> >>> For Martijn's concern, if we accept to make the watermark-related
> options
> >>> part of dynamic table options, the problem becomes another topic: how
> to
> >>> support the dynamic table options in table API, which is deserved to
> create
> >>> another FLIP.
> >>>
> >>> Best
> >>> Yun Tang
> >>> ________________________________
> >>> From: Martijn Visser <martijnvis...@apache.org>
> >>> Sent: Thursday, February 23, 2023 17:14
> >>> To: dev@flink.apache.org <dev@flink.apache.org>
> >>> Subject: Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
> >>>
> >>> Hi,
> >>>
> >>> While I can understand that there's a desire to first focus on solving
> >>> this
> >>> problem for SQL, I do wonder if we should ignore the Table API at this
> >>> point. If we could include the syntax for the Table API, it potentially
> >>> could also be implemented by another contributor without needing to
> create
> >>> another FLIP. If we don't design it right now, my concern is that this
> >>> will
> >>> increase sparsity for the Table API which ultimately hurts adoption.
> >>>
> >>> With regards to the syntax, I have a preference to solve this via the
> >>> connector options (e.g. like you can currently specify things as
> >>> scan.startup.specific-offsets or scan.bounded.mode for the Kafka
> >>> connector). You could still use the dynamic table options to
> override/add
> >>> them.
> >>>
> >>> Best regards,
> >>>
> >>> Martijn
> >>>
> >>> On Thu, Feb 23, 2023 at 7:21 AM Shammon FY <zjur...@gmail.com> wrote:
> >>>
> >>>> Hi kui
> >>>>
> >>>> Thanks for your answer and +1 to yuxia too
> >>>>
> >>>>> we should not bind the watermark-related options to a connector to
> >>> ensure
> >>>> semantic clarity.
> >>>>
> >>>> In my opinion, adding watermark-related options to a connector is much
> >>> more
> >>>> clear. Currently users can define simple watermark strategy in DDL,
> >>> adding
> >>>> more configuration items in connector options is easy to understand
> >>>>
> >>>> Best,
> >>>> Shammon
> >>>>
> >>>>
> >>>> On Thu, Feb 23, 2023 at 10:52 AM Jingsong Li <jingsongl...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Thanks for your proposal.
> >>>>>
> >>>>> +1 to yuxia, consider watermark-related hints as option hints.
> >>>>>
> >>>>> Personally, I am cautious about adding SQL syntax, WATERMARK_PARAMS
> is
> >>>>> also SQL syntax to some extent.
> >>>>>
> >>>>> We can use OPTIONS to meet this requirement if possible.
> >>>>>
> >>>>> Best,
> >>>>> Jingsong
> >>>>>
> >>>>> On Thu, Feb 23, 2023 at 10:41 AM yuxia <luoyu...@alumni.sjtu.edu.cn>
> >>>>> wrote:
> >>>>>>
> >>>>>> Hi, Yuan Kui.
> >>>>>> Thanks for driving it.
> >>>>>> IMO, the 'OPTIONS' hint may be not only specific to the connector
> >>>>> options. Just as a reference, we also have `sink.parallelism`[1] as a
> >>>>> connector options. It enables
> >>>>>> user to specific the writer's parallelism dynamically per-query.
> >>>>>>
> >>>>>> Personally, I perfer to consider watermark-related hints as option
> >>>>> hints. So, user can define a default watermark strategy for the
> table,
> >>>> and
> >>>>> if user dosen't needed to changes it, they need to do nothing in
> their
> >>>>> query instead of specific it ervery time.
> >>>>>>
> >>>>>> [1]
> >>>>>
> >>>>
> >>>
> https://nightlies.apache.org/flink/flink-docs-master/zh/docs/connectors/table/filesystem/#sink-parallelism
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Yuxia
> >>>>>>
> >>>>>> Best regards,
> >>>>>> Yuxia
> >>>>>>
> >>>>>> ----- 原始邮件 -----
> >>>>>> 发件人: "kui yuan" <catye...@gmail.com>
> >>>>>> 收件人: "dev" <dev@flink.apache.org>
> >>>>>> 抄送: "Jark Wu" <imj...@gmail.com>
> >>>>>> 发送时间: 星期三, 2023年 2 月 22日 下午 10:08:11
> >>>>>> 主题: Re: [DISCUSS] FLIP-296: Watermark options for table API & SQL
> >>>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> Thanks for the lively discussion and I will respond to these
> >>> questions
> >>>>> one
> >>>>>> by one. However, there are also some common questions and I will
> >>> answer
> >>>>>> together.
> >>>>>>
> >>>>>> @郑 Thanks for your reply. The features mentioned in this flip are
> >>> only
> >>>>> for
> >>>>>> those source connectors that implement the SupportsWatermarkPushDown
> >>>>>> interface, generating watermarks in other graph locations is not in
> >>> the
> >>>>>> scope of this discussion. Perhaps another flip can be proposed
> >>> later to
> >>>>>> implement this feature.
> >>>>>>
> >>>>>> @Shammon Thanks for your reply. In Flip-296, a rejected alternative
> >>> is
> >>>>>> adding watermark related options in the connector options,we believe
> >>>> that
> >>>>>> we should not bind the watermark-related options to a connector to
> >>>> ensure
> >>>>>> semantic clarity.
> >>>>>>
> >>>>>>> What will happen if we add watermark related options in `the
> >>>> connector
> >>>>>>> options`? Will the connector ignore these options or throw an
> >>>>> exception?
> >>>>>>> How can we support this?
> >>>>>>
> >>>>>> If user defines different watermark configurations for one table in
> >>> two
> >>>>>> places, I tend to prefer the first place would prevail, but  we can
> >>>> also
> >>>>>> throw exception or just print logs to prompt the user, which are
> >>>>>> implementation details.
> >>>>>>
> >>>>>>> If one table is used by two operators with different watermark
> >>>> params,
> >>>>>>> what will happen?
> >>>>>>
> >>>>>> @Martijn Thanks for your reply.  I'm sorry that we are not
> >>> particularly
> >>>>>> accurate, this hint is mainly for SQL,  not table API.
> >>>>>>
> >>>>>>> While the FLIP talks about watermark options for Table API & SQL,
> >>> I
> >>>>> only
> >>>>>>> see proposed syntax for SQL, not for the Table API. What is your
> >>>>> proposal
> >>>>>>> for the Table API
> >>>>>>
> >>>>>> @Jane Thanks for your reply. For the first question, If the user
> >>> uses
> >>>>> this
> >>>>>> hint on those sourse that does not implement the
> >>>>> SupportsWatermarkPushDown
> >>>>>> interface, it will be completely invalid. The task will run as
> >>> normal
> >>>> as
> >>>>> if
> >>>>>> the hint had not been used.
> >>>>>>
> >>>>>>> What's the behavior if there are multiple table sources, among
> >>> which
> >>>>>>> some do not support `SupportsWatermarkPushDown`?
> >>>>>>
> >>>>>> @Jane feedback that 'WATERMARK_PARAMS' is difficult to remember,
> >>>> perhaps
> >>>>>> the naming issue can be put to the end of the discussion, because
> >>> more
> >>>>>> people like @Martijn @Shuo are considering whether these
> >>> configurations
> >>>>>> should be put into the DDL or the 'OPTIONS' hint. Here's what I
> >>>>>> think, Putting these configs into DDL or putting them into 'OPTIONS'
> >>>> hint
> >>>>>> is actually the same thing, because the 'OPTIONS' hint is mainly
> >>> used
> >>>> to
> >>>>>> configure the properties of conenctor. The reason why I want to use
> >>> a
> >>>> new
> >>>>>> hint is to make sure the semantics clear, in my opinion the
> >>>> configuration
> >>>>>> of watermark should not be mixed up with connector. However, a new
> >>> hint
> >>>>>> does make it more difficult to use to some extent, for example,
> >>> when a
> >>>>> user
> >>>>>> uses both 'OPTIONS' hint and 'WATERMARK_PARAMS' hint. For this
> >>> point,
> >>>>> maby
> >>>>>> it is more appropriate to use uniform 'OPTIONS' hint.
> >>>>>> On the other hand, if we enrich more watermark option keys in
> >>> 'OPTIONS'
> >>>>>> hints, The question will be what we treat the definatrions of
> >>>> 'OPTIONS'
> >>>>>> hint, is this only specific to the connector options or could be
> >>> more?
> >>>>>> Maybe @Jark could share more insights here. In my opion, 'OPTIONS'
> >>> is
> >>>>> only
> >>>>>> related to the connector options, which is not like the gernal
> >>>> watermark
> >>>>>> options.
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> Shuo Cheng <njucs...@gmail.com> 于2023年2月22日周三 19:17写道:
> >>>>>>
> >>>>>>> Hi Kui,
> >>>>>>>
> >>>>>>> Thanks for driving the discussion. It's quite useful to introduce
> >>>>> Watermark
> >>>>>>> options. I have some questions:
> >>>>>>>
> >>>>>>> What kind of hints is "WATERMARK_PARAMS"?
> >>>>>>> Currently, we have two kinds of hints in Flink: Dynamic Table
> >>>> Options &
> >>>>>>> Query Hints. As described in the Flip, "WATERMARK_PARAMS" is more
> >>>> like
> >>>>>>> Dynamic Table Options. So two questions arise here:
> >>>>>>>
> >>>>>>> 1)  Are these watermark options to be exposed as connector WITH
> >>>>> options? Aa
> >>>>>>> described in SQL Hints doc[1],  "Dynamic Table Options allow to
> >>>>> specify or
> >>>>>>> override table options dynamically", which implies that these
> >>> options
> >>>>> can
> >>>>>>> also be configured in WITH options.
> >>>>>>>
> >>>>>>> 2)  Do we really need a new hint name like 'WATERMARK_PARAMS',
> >>> table
> >>>>>>> options use "OPTIONS" as hint name, like '/*+
> >>>>>>> OPTIONS('csv.ignore-parse-errors'='true') */', maybe we can enrich
> >>>> more
> >>>>>>> table option keys for watermark, e.g., /*+
> >>>>>>> OPTIONS('watermark.emit-strategy'='ON_PERIODIC') */.
> >>>>>>>
> >>>>>>>
> >>>>>>> [1]
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>
> https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/table/sql/queries/hints/
> >>>>>>>
> >>>>>>> On Wed, Feb 22, 2023 at 10:22 AM kui yuan <catye...@gmail.com>
> >>>> wrote:
> >>>>>>>
> >>>>>>>> Hi devs,
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> I'd like to start a discussion thread for FLIP-296[1]. This
> >>> comes
> >>>>> from an
> >>>>>>>> offline discussion with @Yun Tang, and we hope to enrich table
> >>> API
> >>>> &
> >>>>> SQL
> >>>>>>> to
> >>>>>>>> support many watermark-related features which were only
> >>> implemented
> >>>>> at
> >>>>>>> the
> >>>>>>>> datastream API level.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Basically, we want to introduce watermark options in table API &
> >>>> SQL
> >>>>> via
> >>>>>>>> SQL hint named 'WATERMARK_PARAMS' to support features:
> >>>>>>>>
> >>>>>>>> 1、Configurable watermark emit strategy
> >>>>>>>>
> >>>>>>>> 2、Dealing with idle sources
> >>>>>>>>
> >>>>>>>> 3、Watermark alignment
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Last but not least, thanks to Qingsheng and Jing Zhang for the
> >>>>> initial
> >>>>>>>> reviews.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Looking forward to your thoughts and any feedback is
> >>> appreciated!
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> [1]
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240884405
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best
> >>>>>>>>
> >>>>>>>> Yuan Kui
> >>>>>>>>
> >>>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
>
>

Reply via email to