Thanks for the FLIP, it helps a lot for us to develop features like
VECTOR_SEARCH. But I have some questions about the FLIP:

1. Timing of Option Consumption for SEARCH_KEY Parameters
For the FOR SYSTEM_TIME AS OF syntax, the planner leverages hints and
catalog tables to load the table scan during the toRel phase. However, if
users use the SEARCH_KEY function, is the planner able to consume these
options at the same early stage?

2.Handling Column Name Conflicts in SEARCH_KEY Output Schema
What is the output schema behavior for the SEARCH_KEY function when the
left and right tables have columns with conflicting names?
How can users resolve these name conflicts to ensure unambiguous access to
the desired columns?

3. Details on Correlated PTFs
Could you elaborate on correlated PTFs? What is the API design for
implementing correlated PTFs? How does a PTF retrieve the required model
and lookup function during execution?

4. Specifying Literal Values in SEARCH_KEY Function
How can users include literal values in the SEARCH_KEY function parameters?
The FOR SYSTEM_TIME AS OF syntax allows users to define complex temporal
conditions. Does SEARCH_KEY support equivalent flexibility for specifying
dynamic or literal values in its parameters?

Best,
Shengkai



Hao Li <h...@confluent.io.invalid> 于2025年4月3日周四 00:12写道:

> Hi Timo,
>
> Any question I have is what's the SEARCH_KEY result schema you have in
> mind? Can it output multiple rows for every row in the left table or it
> needs to pack the result in a single row as an array?
>
> Thanks,
> Hao
>
> On Mon, Mar 24, 2025 at 10:20 AM Hao Li <h...@confluent.io> wrote:
>
> > Thanks Timo for the FLIP! This is a great improvement to the FLINK sql
> > syntax around tables. I have two clarification questions:
> >
> > 1. For SEARCH_KEY
> > ```
> > SELECT *
> > FROM
> >   t_other,
> >   LATERAL SEARCH_KEY(
> >     input => t,
> >     on_key => DESCRIPTOR(k),
> >     lookup => t_other.name,
> >     options => MAP[
> >       'async', 'true',
> >       'retry-predicate', 'lookup_miss',
> >       'retry-strategy', 'fixed_delay',
> >       'fixed-delay'='10s'
> >     ]
> >   )
> > ```
> > Table `t` needs to be an existing `LookupTableSource` [1], right? And we
> > will rewrite it to `StreamPhysicalLookupJoin`  [2] or similar operator
> > during the physical optimization phase.
> > Also to support passing options, we need to extend `LookupContext` [3] to
> > have a `getOptions` or `getRuntimeOptions` method?
> >
> > 2. For FROM_CHANGELOG
> > ```
> > SELECT * FROM FROM_CHANGELOG(s) AS t;
> > ```
> > Do we need to introduce a `DataStream` resource in sql first?
> >
> >
> > Hao
> >
> >
> >
> >
> > [1]
> >
> >
> https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/connector/source/LookupTableSource.java
> > [2]
> >
> >
> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/nodes/physical/stream/StreamPhysicalLookupJoin.scala#L41
> > [3]
> >
> >
> https://github.com/apache/flink/blob/master/flink-table/flink-table-common/src/main/java/org/apache/flink/table/connector/source/LookupTableSource.java#L82
> >
> >
> > On Fri, Mar 21, 2025 at 6:25 AM Timo Walther <twal...@apache.org> wrote:
> >
> >> Hi everyone,
> >>
> >> I would like to start a discussion about FLIP-517: Better Handling of
> >> Dynamic Table Primitives with PTFs [1].
> >>
> >> In the past months, I have spent a significant amount of time with SQL
> >> semantics and the SQL standard around PTFs, when designing and
> >> implementing FLIP-440 [2]. For those of you that have not taken a look
> >> into the standard, the concept of Polymorphic Table Functions (PTF)
> >> enables syntax for implementing custom SQL operators. In my opinion,
> >> they are kind of a revolution in the SQL language. PTFs can take scalar
> >> values, tables, models (in Flink), and column lists as arguments. With
> >> these primitives, we can further evolve shortcomings in the Flink SQL
> >> language by leveraging syntax and semantics.
> >>
> >> I would like introduce a couple of built-in PTFs with the goal to make
> >> the handling of dynamic tables easier for users. Once users understand
> >> how a PTF works, they can easily select from a list of functions to
> >> approach a table for snapshots, changelogs, or searching.
> >>
> >> The FLIP proposes:
> >>
> >> SNAPSHOT()
> >> SEARCH_KEY()
> >> TO_CHANGELOG()
> >> FROM_CHANGELOG()
> >>
> >> I'm aware that this is a delicate topic, and might lead to controversial
> >> discussions. I hope with concise naming and syntax the benefit over the
> >> existing syntax becomes clear.
> >>
> >> There are more useful PTFs to come, but those are the ones that I
> >> currently see as the most fundamental ones to tell a round story around
> >> Flink SQL.
> >>
> >> Looking forward to your feedback.
> >>
> >> Thanks,
> >> Timo
> >>
> >> [1]
> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-517%3A+Better+Handling+of+Dynamic+Table+Primitives+with+PTFs
> >> [2]
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093
> >>
> >
>

Reply via email to