Thanks Jane for clarifying this. I'm ok with keeping the current hint
pattern with the table name specified by key. Just thinking if there
is another simpler way to define the hint. What if an omitted key hint
only applied for tables from the current query block?


Thanks.


Best,
Zakelly

On Wed, Oct 11, 2023 at 11:25 PM Jane Chan <qingyue....@gmail.com> wrote:
>
> Thank you all for the valuable feedback.
>
> There is a difference when users incorrectly use hints, and there are two
> possible scenarios:
> <1> STATE_TTL hint can be applied to the current query block but with an
> invalid key or value. In this case, the validation exception will be
> thrown. The HintOptionChecker will throw exceptions if the options are
> empty or the value is an invalid duration. And JoinHintResolver will throw
> exceptions if the hint key(table name/alias, etc.) does not exist. So does
> for the group aggregate.
>
> <2> STATE_TTL hint cannot be applied to the current query block, e.g.,
> SELECT /*+ STATE_TTL('MyTable' = '2h') */ * a, b, c FROM MyTable. In this
> case, the hint is ignored. This is a by-design behavior according to
> FLIP-229 [1].
>
> I've made the modifications to the FLIP regarding exception handling. I
> would appreciate it if you could review it again.
>
> @Xuyang
>
> >  I notice that using STATE_TTL hints wrongly will not throw any
> > exceptions. But it seems that in the current join hint scenario, if user
> > uses an unknown table name as the chosen side, a validation exception will
> > be thrown. Maybe we should distinguish which exceptions need to be thrown
> > explicitly.
>
>
> You're right; the STATE_TTL hint semantic check should throw exceptions
> like join hints.
>
> @Feng @Yun
>
> > since there is no exception thrown when the STATE hint is set incorrectly,
> > should we somehow show that the TTL setting has taken effect?
>
> For instance, exhibiting the TTL option within the operator's description?
>
>
> We can throw explicit exceptions for scenario #1. For scenario #2, I prefer
> to align the behavior for current query block hints for now (and we may
> open a separate discussion in the future). On the other hand, from the
> implementation aspect, it is not easy to do so. Taking the example of
> deduplication mentioned earlier, the hint is lost before it propagates to
> the FlinkLogicalRank node, making it challenging to capture the exception.
>
> @Zakelly
>
> > would it be possible to provide a simple hint that allows the omission of
> > the key? For example, something like "SELECT /+ STATE_TTL('1h')/", which
> > would specify the TTL for all states in the 'SELECT' clause.
>
>
> I'm afraid the hint key cannot be omitted. E.g., the join operator is a
> TwoInputStreamOperator; if we want to specify different state TTLs for the
> left and right input, we need to inform the planner of the ordinal info
> (LEFT v.s. RIGHT). On the other hand, the SELECT clause may contain several
> query blocks, while the hint scope is designed to apply to the current
> query block [2] to prevent the hint on the outer query block from
> propagating to the inner query block. For the use case where users need to
> specify the TTL for all states in the 'SELECT' clause, it is preferable to
> modify the compiled plan instead of using hints.
>
> @ConradJam
> I share the same opinion with @Yun that this is a nice-to-have feature. Big
> +1 for the follow-up on FLINK-33230. Users can now use the COMPILE PLAN
> statement to serialize the query to a JSON string or file and then check
> the state metadata to ensure the hint is applied.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-229%3A+Introduces+Join+Hint+for+Flink+SQL+Batch+Job
> <https://cwiki.apache.org/confluence/display/FLINK/FLIP-229%3A+Introduces+Join+Hint+for+Flink+SQL+Batch+Job>
> [2]
> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#query-hints
> <https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/hints/#query-hints>
>
> Best,
> Jane
>
>
> On Wed, Oct 11, 2023 at 8:49 PM Yun Tang <myas...@live.com> wrote:
>
> > I think showing the TTL for operators is a nice-to-have feature, not a
> > must one in this FLIP. We can still get the information from the operator
> > descriptions.
> >
> > And I think we can continue the TTL showing work based on FLINK-33230 [1].
> >
> > Last but not least, I prefer to throw exceptions if the TTL configuration
> > is mistakenly used as it will affect the data correctness.
> >
> > [1] https://issues.apache.org/jira/browse/FLINK-33230
> >
> > Best
> > Yun Tang
> > ________________________________
> > From: ConradJam <jam.gz...@gmail.com>
> > Sent: Wednesday, October 11, 2023 20:30
> > To: dev@flink.apache.org <dev@flink.apache.org>
> > Subject: Re: Re: [DISCUSS] FLIP-373: Support Configuring Different State
> > TTLs using SQL Hint
> >
> > +1 TTL shows the state ttl for operators in Flink web ui can be know what
> > operator state
> >
> > Zakelly Lan <zakelly....@gmail.com> 于2023年10月11日周三 19:14写道:
> >
> > > Hi Jane,
> > >
> > > The fine-grained TTL management is extremely useful for performance
> > > tuning, so +1 for the idea. I have a minor suggestion: would it be
> > > possible to provide a simple hint that allows the omission of the key?
> > > For example, something like "SELECT /+ STATE_TTL('1h')/", which would
> > > specify the TTL for all states in the 'SELECT' clause.
> > >
> > > And I also share the same concern as Feng. I am wondering if we could
> > > show the state ttl for operators in Flink UI.
> > >
> > >
> > > Best,
> > > Zakelly
> > >
> > > On Wed, Oct 11, 2023 at 1:27 PM Feng Jin <jinfeng1...@gmail.com> wrote:
> > > >
> > > > Hi Jane,
> > > >
> > > > Thank you for providing this explanation.
> > > >
> > > > Another small question, since there is no exception thrown when the
> > STATE
> > > > hint is set incorrectly,
> > > > should we somehow show that the TTL setting has taken effect?
> > > > For instance, exhibiting the TTL option within the operator's
> > > description?
> > > >
> > > > Best,
> > > > Feng
> > > >
> > > > On Tue, Oct 10, 2023 at 7:51 PM Xuyang <xyzhong...@163.com> wrote:
> > > >
> > > > > Hi, Jane.
> > > > >
> > > > >
> > > > > I think this syntax will be easier for users to set operator ttl. So
> > > big
> > > > > +1. I left some minor comments here.
> > > > >
> > > > >
> > > > > I notice that using STATE_TTL hints wrongly will not throw any
> > > exceptions.
> > > > > But it seems that in the current join hint scenario,
> > > > > if user uses an unknown table name as the chosen side, a validation
> > > > > exception will be thrown.
> > > > > Maybe we should distinguish which exceptions need to be thrown
> > > explicitly.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > >
> > > > >     Best!
> > > > >     Xuyang
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > At 2023-10-10 18:23:55, "Jane Chan" <qingyue....@gmail.com> wrote:
> > > > > >Hi Feng,
> > > > > >
> > > > > >Thank you for your valuable comments. The reason for not including
> > the
> > > > > >scenarios above is as follows:
> > > > > >
> > > > > >For <1>, the automatically inferred stateful operators are not
> > easily
> > > > > >expressible in SQL. This issue was discussed in FLIP-292, and
> > besides
> > > > > >ChangelogNormalize, SinkUpsertMateralizer also faces the same
> > problem.
> > > > > >
> > > > > >For <2> and <3>, the challenge lies in internal implementation.
> > > During the
> > > > > >default_rewrite phase, the row_number expression in LogicalProject
> > is
> > > > > >transformed into LogicalWindow by Calcite's
> > > > > >CoreRules#PROJECT_TO_LOGICAL_PROJECT_AND_WINDOW. However,
> > > CalcRelSplitter
> > > > > >does not pass the hints as an input argument when creating
> > > LogicalWindow,
> > > > > >resulting in the loss of the hint at this step. To support this, we
> > > may
> > > > > >need to rewrite some optimization rules in Calcite, which could be a
> > > > > >follow-up work if required.
> > > > > >
> > > > > >Best,
> > > > > >Jane
> > > > > >
> > > > > >On Tue, Oct 10, 2023 at 1:40 AM Feng Jin <jinfeng1...@gmail.com>
> > > wrote:
> > > > > >
> > > > > >> Hi Jane,
> > > > > >>
> > > > > >> Thank you for proposing this FLIP.
> > > > > >>
> > > > > >> I believe that this FLIP will greatly enhance the flexibility of
> > > setting
> > > > > >> state, and by setting different operators' TTL, it can also
> > > increase job
> > > > > >> stability, especially in regular join scenarios.
> > > > > >> The parameter design is very concise, big +1 for this, and it is
> > > also
> > > > > >> relatively easy to use for users.
> > > > > >>
> > > > > >>
> > > > > >> I have a small question: in the FLIP, it only mentions join and
> > > group.
> > > > > >> Should we also consider other scenarios?
> > > > > >>
> > > > > >> 1. the auto generated deduplicate operator[1].
> > > > > >> 2. the deduplicate query[2].
> > > > > >> 3. the topN query[3].
> > > > > >>
> > > > > >> [1]
> > > > > >>
> > > > > >>
> > > > >
> > >
> > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/config/#table-exec-source-cdc-events-duplicate
> > > > > >> [2]
> > > > > >>
> > > > > >>
> > > > >
> > >
> > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/deduplication/
> > > > > >> [3]
> > > > > >>
> > > > > >>
> > > > >
> > >
> > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/topn/
> > > > > >>
> > > > > >>
> > > > > >> Best,
> > > > > >> Feng
> > > > > >>
> > > > > >> On Sun, Oct 8, 2023 at 5:53 PM Jane Chan <qingyue....@gmail.com>
> > > wrote:
> > > > > >>
> > > > > >> > Hi devs,
> > > > > >> >
> > > > > >> > I'd like to initiate a discussion on FLIP-373: Support
> > Configuring
> > > > > >> > Different State TTLs using SQL Hint [1]. This proposal is on top
> > > of
> > > > > the
> > > > > >> > FLIP-292 [2] to address typical scenarios with unambiguous
> > > semantics
> > > > > and
> > > > > >> > hint propagation.
> > > > > >> >
> > > > > >> > I'm looking forward to your opinions!
> > > > > >> >
> > > > > >> >
> > > > > >> > [1]
> > > > > >> >
> > > > > >> >
> > > > > >>
> > > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-373%3A+Support+Configuring+Different+State+TTLs+using+SQL+Hint
> > > > > >> > [2]
> > > > > >> >
> > > > > >> >
> > > > > >>
> > > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-292%3A+Enhance+COMPILED+PLAN+to+support+operator-level+state+TTL+configuration
> > > > > >> >
> > > > > >> > Best,
> > > > > >> > Jane
> > > > > >> >
> > > > > >>
> > > > >
> > >
> >
> >
> > --
> > Best
> >
> > ConradJam
> >

Reply via email to