Re: [DISCUSS] FLIP-292: Support configuring state TTL at operator level for Table API & SQL programs

Shammon FY Thu, 23 Mar 2023 20:00:22 -0700

Hi Jane,

Thanks for initiating this discussion. Configuring TTL per operator can
help users manage state more effectively.


I think the `compiled json plan` proposal may need to consider the impact
on the user's submission workflow. Generally, Flink jobs support two types
of submission: SQL and jar. If users want to use `TTL on Operator` for SQL
jobs, they need to edit the JSON file, which is not supported by general job
submission systems such as the Flink SQL client, Apache Kyuubi, Apache
StreamPark, etc. Users need to download the file and edit it manually,
but in a real production environment they may not have permissions on the
storage system, such as HDFS.
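
To make the manual step concrete, here is a minimal sketch of what editing
a downloaded plan could involve. The JSON layout (`nodes`, `id`, `state`,
`index`, `ttl`), the sample plan, and the helper name `set_operator_ttl` are
all assumptions for illustration, not the format defined by the FLIP.

```python
import json

# Hypothetical, heavily simplified compiled plan. The real plan layout
# may differ; the field names here are assumptions for illustration only.
SAMPLE_PLAN = """
{
  "nodes": [
    {"id": 1, "type": "source"},
    {"id": 2, "type": "join",
     "state": [
       {"index": 0, "name": "leftState", "ttl": "0 ms"},
       {"index": 1, "name": "rightState", "ttl": "0 ms"}
     ]}
  ]
}
"""


def set_operator_ttl(plan, node_id, ttl, state_index=None):
    """Set `ttl` on the state entries of the node with the given id.

    If `state_index` is None, every state entry of that node is updated.
    Returns the number of state entries that were changed.
    """
    updated = 0
    for node in plan.get("nodes", []):
        if node.get("id") != node_id:
            continue
        # Stateless nodes have no "state" list, so nothing is touched.
        for state in node.get("state", []):
            if state_index is None or state.get("index") == state_index:
                state["ttl"] = ttl
                updated += 1
    return updated


plan = json.loads(SAMPLE_PLAN)
# Give only the left input state of the join a one-hour TTL.
set_operator_ttl(plan, node_id=2, ttl="1 h", state_index=0)
print(json.dumps(plan, indent=2))
```

Even in this simplified form, the user has to fetch the file, know the node
id, and write the file back, which is the part that submission systems do
not cover today.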

From this perspective, I think it is necessary to provide a way, similar to
hints, for users to configure the `TTL on Operator` in their SQL, which
would make it more convenient to use. At the same time, I agree with @Shuo's
idea that for complex cases, users can combine hints and the `json plan` to
configure `TTL on Operator` better. What do you think? Thanks


Best,
Shammon FY


On Thu, Mar 23, 2023 at 9:58 PM Shuo Cheng <njucs...@gmail.com> wrote:

> Correction: “users can set 'scan.startup.mode' for kafka connector” ->
> “users can set 'scan.startup.mode' for kafka connector by dynamic table
> option”
>
> On Thu, Mar 23, 2023 at 21:50, Shuo Cheng <njucs...@gmail.com> wrote:
>
> > Hi Jane,
> > Thanks for driving this. Operator-level state TTL is absolutely a desired
> > feature. I would like to share my opinion as follows:
> >
> > If the scope of this proposal is limited to an enhancement for the
> > compiled json plan, it makes sense. I think it does not conflict with
> > configuring state ttl in other ways, e.g., SQL HINT or something else,
> > because they work at different levels: SQL Hint works at the entrance of
> > the SQL API, while the compiled json plan is the intermediate result for
> > SQL.
> > I think the final shape of state ttl configuration may look like this:
> > users can define operator state ttl using SQL HINT (assumption...), but it
> > may affect more than one stateful operator inside the same query block, so
> > users can then further configure a specific one by modifying the compiled
> > json plan...
> >
> > In a word, this proposal is in good shape as an enhancement for the
> > compiled json plan, and it's orthogonal to other ways like SQL Hint, which
> > works at a higher level.
> >
> >
> > Nits:
> >
> > > "From the SQL semantic perspective, hints cannot intervene in the
> > > calculation of data results."
> > I think it's more proper to say that a hint does not affect the
> > equivalence of execution plans (hash agg vs sort agg), rather than the
> > equivalence of execution results, e.g., users can set 'scan.startup.mode'
> > for kafka connector, which also "intervene in the calculation of data
> > results".
> >
> > Sincerely,
> > Shuo
> >
> > On Tue, Mar 21, 2023 at 7:52 PM Jane Chan <qingyue....@gmail.com> wrote:
> >
> >> Hi devs,
> >>
> >> I'd like to start a discussion on FLIP-292: Support configuring state TTL
> >> at operator level for Table API & SQL programs [1].
> >>
> >> Currently, we only support job-level state TTL configuration via
> >> 'table.exec.state.ttl'. However, users may expect fine-grained state TTL
> >> control to optimize state usage. Hence we propose to serialize/deserialize
> >> the state TTL as metadata of the operator's state to/from the compiled
> >> JSON plan, so that different state TTLs can be specified when
> >> transforming the exec node to stateful operators.
> >>
> >> Look forward to your opinions!
> >>
> >> [1]
> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240883951
> >>
> >> Best Regards,
> >> Jane Chan
> >>
> >
>
