Re: [DISCUSS] FLIP-292: Support configuring state TTL at operator level for Table API & SQL programs

Yun Tang Thu, 23 Mar 2023 23:51:31 -0700

Hi,

From my point of view, I am a bit against using a SQL hint to set state TTL, as
a Flink SQL query could be translated into several stateful operators. If we
want different states within one operator to have different TTL configs, the
SQL hint solution cannot work. A better way is to let a graphical IDE display
the stateful operators so that users can configure them; the IDE then submits
the JSON plan to Flink to run the job.

For the details of the structure of ExecNodes, since the state name is unique 
in the underlying state layer, shall we introduce the "index" tag to identify 
the state config?
What would happen in the following case?
1st run:
   {
     "index": 0,
     "ttl": "259200000 ms",
     "name": "join-left-state"
   },
   {
     "index": 1,
     "ttl": "86400000 ms",
     "name": "join-right-state"
   }

2nd run:
   {
     "index": 0,
     "ttl": "86400000 ms",
     "name": "join-right-state"
   },
   {
     "index": 1,
     "ttl": "259200000 ms",
     "name": "join-left-state"
   }
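
To make the question concrete, here is a minimal Python sketch of the two runs above. It is only a toy comparison, not how Flink restores state: the helper names (ttl_by_name, ttl_by_index, changed_keys) are made up for illustration. It shows that the two runs look identical if the config is matched by state name, but look like a TTL swap if it is matched by index.

```python
# Hypothetical illustration only: this is not Flink's restore logic.
# It contrasts matching TTL configs by "name" with matching by "index".

first_run = [
    {"index": 0, "ttl": "259200000 ms", "name": "join-left-state"},
    {"index": 1, "ttl": "86400000 ms", "name": "join-right-state"},
]
second_run = [
    {"index": 0, "ttl": "86400000 ms", "name": "join-right-state"},
    {"index": 1, "ttl": "259200000 ms", "name": "join-left-state"},
]


def ttl_by_name(configs):
    """Map each state name to its TTL, ignoring the index."""
    return {c["name"]: c["ttl"] for c in configs}


def ttl_by_index(configs):
    """Map each index to its TTL, ignoring the name."""
    return {c["index"]: c["ttl"] for c in configs}


def changed_keys(before, after):
    """Return the keys whose TTL differs between the two runs."""
    return sorted(k for k in before if before[k] != after.get(k))


name_changes = changed_keys(ttl_by_name(first_run), ttl_by_name(second_run))
index_changes = changed_keys(ttl_by_index(first_run), ttl_by_index(second_run))

print("changed when matched by name:", name_changes)
print("changed when matched by index:", index_changes)
```

Matched by name, nothing changes between the runs; matched by index, both entries change. Which of the two the plan should treat as the identity of a state config is exactly what I would like to clarify.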

Best
Yun Tang
________________________________
From: Jane Chan <[email protected]>
Sent: Friday, March 24, 2023 11:57
To: [email protected] <[email protected]>
Subject: Re: [DISCUSS] FLIP-292: Support configuring state TTL at operator 
level for Table API & SQL programs

Hi Shammon and Shuo,

Thanks for your valuable comments!

Some thoughts:

@Shuo
> I think it's more proper to say that a hint does not affect the
> equivalence of execution plans (hash agg vs sort agg), not the equivalence
> of execution results, e.g., users can set 'scan.startup.mode' for kafka
> connector by dynamic table option, which also "intervene in the
> calculation of data results".

IMO, the statement that "hint should not interfere with the calculation
results" means it should not interfere with internal computation. On the
other hand, 'scan.startup.mode' interferes with the ingestion of the data.
I think these two concepts are different, but of course, this is just my
opinion, and I welcome other views.

> I think the final shape of state ttl configuring may be like that:
> users can define operator state ttl using SQL HINT (assumption...), but it
> may affect more than one stateful operator inside the same query block,
> then users can further configure a specific one by modifying the compiled
> json plan...

Setting aside the issue of semantics, setting TTL from a higher level seems
to be attractive. This means that users only need to configure
'table.exec.state.ttl' through the existing hint syntax to achieve the
effect. Everything is a familiar formula. But is it really the case? Hints
apply to a very broad range. Let me give an example.

Suppose a user wants to set different TTLs for the two streams in a stream
join query. Where should the hints be written?

-- the original query before configuring state TTL
create temporary view view1 as select .... from my_table_1;
create temporary view view2 as select .... from my_table_2;
create temporary view joined_view as
select a.*, b.* from view1 a join view2 b on a.join_key = b.join_key;

Option 1: declaring hints at the very beginning of the table scan

-- should he or she write hints when declaring the first temporary view?
create temporary view view1 as select .... from my_table_1
/*+ OPTIONS('table.exec.state.ttl' = 'foo') */;
create temporary view view2 as select .... from my_table_2
/*+ OPTIONS('table.exec.state.ttl' = 'bar') */;
create temporary view joined_view as
select a.*, b.* from view1 a join view2 b on a.join_key = b.join_key;

Option 2: declaring hints when performing the join

-- or should he or she write hints when declaring the join temporary view?
create temporary view view1 as select .... from my_table_1;
create temporary view view2 as select .... from my_table_2;
create temporary view joined_view as
select a.*, b.* from view1 /*+ OPTIONS('table.exec.state.ttl' = 'foo') */ a
join view2 /*+ OPTIONS('table.exec.state.ttl' = 'bar') */ b
on a.join_key = b.join_key;

From the user's point of view, does he or she need to care about the
difference between these two styles? Users might think the two are
equivalent; but in reality, as developers, how do we define the range in
which a hint starts and ends to take effect?

Consider the following two assumptions:

1. Assuming the hint takes effect from the moment it is declared and
applies to any subsequent stateful operators until it is overridden by a
new hint.
If this is the assumption, it's clear that Option 1 and Option 2 are
different because a ChangelogNormalize node can appear between scan and
join. Meanwhile, which stream's TTL should apply to the operators that
follow the stream join? It is unclear if the user does not explicitly set
it. Should the engine make a random decision?

2. Assuming that the scope of the hint only applies to the current query
block and does not extend to the next operator.
In this case, the first way of setting the hint will not work because it
cannot be brought to the join operator. Users must choose the second way to
configure. Are users willing to remember this strange constraint on SQL
writing style? Does this indicate a new learning cost?

The example above illustrates that while this approach may seem simple and
direct, it actually has many limitations and may produce unexpected
behavior. Will users still find it attractive? IMO *hints only work for a
very limited situation where the query is very simple, and their scope is
coarser than operator level*. Maybe it deserves another FLIP to discuss
whether we need a multiple-level state TTL configuration mechanism and how
to properly implement it.
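
The two scoping assumptions can be sketched with a small Python toy model. This is not planner code: the Op class, the resolve function, the "job-level" placeholder, and the assignment of operators to query blocks are all assumptions made for illustration, and the ChangelogNormalize node is assumed to be present on the left input. Only the left input of the join is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str
    block: str      # query block (view) the operator is assumed to belong to
    stateful: bool


# Hypothetical, simplified left input path of the join.
LEFT_PATH = [
    Op("scan(my_table_1)", "view1", False),
    Op("changelog_normalize", "view1", True),
    Op("join(left side)", "joined_view", True),
]


def resolve(path, hints, mode, default="job-level"):
    """Return {stateful operator name: TTL} under one scoping assumption.

    hints maps a query block to the TTL hinted in that block.
    mode "propagate": a hint applies from its block onwards until overridden.
    mode "block": a hint applies only inside its own query block.
    """
    result = {}
    current = default
    for op in path:
        if mode == "propagate":
            current = hints.get(op.block, current)
            ttl = current
        elif mode == "block":
            ttl = hints.get(op.block, default)
        else:
            raise ValueError(f"unknown mode: {mode}")
        if op.stateful:
            result[op.name] = ttl
    return result


option1 = {"view1": "foo"}        # Option 1: hint written on the table scan
option2 = {"joined_view": "foo"}  # Option 2: hint written in the join query

for mode in ("propagate", "block"):
    print(mode, "/ option 1:", resolve(LEFT_PATH, option1, mode))
    print(mode, "/ option 2:", resolve(LEFT_PATH, option2, mode))
```

In this toy model, the two options configure different operators under the first assumption, and under the second assumption the hint from Option 1 never reaches the join. The four combinations give four different outcomes for what the user probably considers the same intent.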

@Shammon
> Generally, Flink jobs support two types of submission: SQL and jar. If
> users want to use `TTL on Operator` for SQL jobs, they need to edit the
> json file which is not supported by general job submission systems such as
> flink sql-client, apache kyuubi, apache streampark, etc. Users need to
> download the file and edit it manually, but they may not have the
> permissions to the storage system such as HDFS in a real production
> environment. From this perspective, I think it is necessary to provide a
> way similar to hints that users can configure the `TTL on Operator` in
> their SQL, which helps users to use it conveniently.

IIUC, the SQL client supports the statement "EXECUTE PLAN
'file:/foo/bar/example.json'". Besides, I don't think the fact that users
cannot touch their development environment is strong evidence that we
should choose hints. As a reply to @Shuo, the TTL set through hints is not
at the operator level. And whether it is really "convenient" needs more
discussion.

> I agree with @Shuo's idea that for complex cases, users can combine hints
> and `json plan` to configure `TTL on Operator` better.

Suppose users can configure TTL through
<1> SET 'table.exec.state.ttl' = 'foo';
<2> Modify the compiled JSON plan;
<3> Use hints (personally I'm strongly against this way, but let's take it
into consideration).
IMO if the user can configure the same parameter in so many ways, then the
complex case only makes things worse. Which one has higher priority, and
which overrides which?
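
A small Python sketch of why the priority matters. The TTL values and the effective_ttl helper are hypothetical, and no precedence order is being proposed here; the sketch only enumerates every possible order of the three paths for a single stateful operator.

```python
from itertools import permutations

# Hypothetical values for one stateful operator, one per configuration path.
sources = {
    "SET 'table.exec.state.ttl'": "1 d",
    "compiled JSON plan": "3 d",
    "hint": "12 h",
}


def effective_ttl(precedence, configured):
    """Return the TTL of the first source in `precedence` that is configured."""
    for source in precedence:
        if source in configured:
            return configured[source]
    return None


# Evaluate the effective TTL under every possible precedence order.
outcomes = {order: effective_ttl(order, sources) for order in permutations(sources)}

for order, ttl in outcomes.items():
    print(" > ".join(order), "->", ttl)
```

With three paths there are six possible orders, and each of the three values wins under some of them, so the effective TTL is undefined until one order is documented and enforced.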

Best,
Jane

On Fri, Mar 24, 2023 at 11:00 AM Shammon FY <[email protected]> wrote:

> Hi Jane
>
> Thanks for initiating this discussion. Configuring TTL per operator can
> help users manage state more effectively.
>
> I think the `compiled json plan` proposal may need to consider the impact
> on the user's submission workflow. Generally, Flink jobs support two types
> of submission: SQL and jar. If users want to use `TTL on Operator` for SQL
> jobs, they need to edit the json file which is not supported by general job
> submission systems such as flink sql-client, apache kyuubi, apache
> streampark, etc. Users need to download the file and edit it manually,
> but they may not have the permissions to the storage system such as HDFS in
> a real production environment.
>
> From this perspective, I think it is necessary to provide a way similar to
> hints that users can configure the `TTL on Operator` in their SQL, which
> helps users to use it conveniently. At the same time, I agree with @Shuo's
> idea that for complex cases, users can combine hints and `json plan` to
> configure `TTL on Operator` better. What do you think? Thanks
>
>
> Best,
> Shammon FY
>
>
> On Thu, Mar 23, 2023 at 9:58 PM Shuo Cheng <[email protected]> wrote:
>
> > Correction: “users can set 'scan.startup.mode' for kafka connector” ->
> > “users
> > can set 'scan.startup.mode' for kafka connector by dynamic table option”
> >
> > On Thu, Mar 23, 2023 at 21:50, Shuo Cheng <[email protected]> wrote:
> >
> > > Hi Jane,
> > > Thanks for driving this, operator level state ttl is absolutely a
> > > desired feature. I would share my opinion as follows:
> > >
> > > If the scope of this proposal is limited to an enhancement for the
> > > compiled json plan, it makes sense. I think it does not conflict with
> > > configuring state ttl in other ways, e.g., SQL HINT or something else,
> > > because they just work at different levels: SQL Hint works at the exact
> > > entrance of the SQL API, while the compiled json plan is the
> > > intermediate result for SQL.
> > > I think the final shape of state ttl configuring may be like that:
> > > users can define operator state ttl using SQL HINT (assumption...), but
> > > it may affect more than one stateful operator inside the same query
> > > block, then users can further configure a specific one by modifying the
> > > compiled json plan...
> > >
> > > In a word, this proposal is in good shape as an enhancement for the
> > > compiled json plan, and it's orthogonal to other ways like SQL Hint,
> > > which works at a higher level.
> > >
> > >
> > > Nits:
> > >
> > > > "From the SQL semantic perspective, hints cannot intervene in the
> > > > calculation of data results."
> > > I think it's more proper to say that a hint does not affect the
> > > equivalence of execution plans (hash agg vs sort agg), not the
> > > equivalence of execution results, e.g., users can set
> > > 'scan.startup.mode' for kafka connector, which also "intervene in the
> > > calculation of data results".
> > >
> > > Sincerely,
> > > Shuo
> > >
> > >
> > > On Tue, Mar 21, 2023 at 7:52 PM Jane Chan <[email protected]> wrote:
> > >
> > >> Hi devs,
> > >>
> > >> I'd like to start a discussion on FLIP-292: Support configuring state
> > >> TTL at operator level for Table API & SQL programs [1].
> > >>
> > >> Currently, we only support job-level state TTL configuration via
> > >> 'table.exec.state.ttl'. However, users may expect a fine-grained state
> > >> TTL control to optimize state usage. Hence we propose to
> > >> serialize/deserialize the state TTL as metadata of the operator's state
> > >> to/from the compiled JSON plan, to achieve the goal that specifying
> > >> different state TTL when transforming the exec node to stateful
> > >> operators.
> > >>
> > >> Look forward to your opinions!
> > >>
> > >> [1]
> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240883951
> > >>
> > >> Best Regards,
> > >> Jane Chan
> > >>
> > >
> >
>
