+1 to use PTF.

I would like to raise one consideration regarding usage:
Would it be necessary to allow users to utilize the CREATE FUNCTION
statement for registering the PTF?

Currently, Flink SQL supports letting external systems register modules and
leverage these modules to centrally manage all function definitions. Given
this architectural approach, I’m curious if the plan involves introducing
additional functions in the future. If so, I would advocate for introducing
a dedicated state module to centralize such management. This would empower
users to:

1. Simply execute the LOAD MODULE command to load the required module, and
2. Directly invoke read_metadata thereafter.

For more details about the module, please refer to this document[1].
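
As a rough sketch of how this could look from the user's side (the module
name `state_metadata`, the argument name `path` and the savepoint location
are placeholders made up for illustration; only the `read_state_metadata`
name comes from the earlier discussion in this thread, and nothing here is
implemented yet):

```sql
-- Hypothetical module name; no such module exists in Flink today.
LOAD MODULE state_metadata;

-- Optional: check that the module is loaded and enabled.
SHOW FULL MODULES;

-- Functions from a loaded module are resolved by name, so no
-- CREATE FUNCTION statement is needed.
-- The `path` argument and its value are assumptions for illustration only.
SELECT * FROM read_state_metadata(path => '/path/to/savepoint');
```

With this approach the function would not have to be registered in any
catalog, and unloading the module would remove it again.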

Best,
Shengkai

[1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/modules/

On Fri, Mar 28, 2025 at 00:26, Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:

> Just found out that PTF is not supported in batch mode, please see the dev
> mailing list thread about it [1].
>
> [1] https://lists.apache.org/thread/ytm9m1qt4pq2q2gjngfktrn8vrlvkf07
>
> BR,
> G
>
>
> On Thu, Mar 27, 2025 at 3:38 PM Gabor Somogyi <gabor.g.somo...@gmail.com>
> wrote:
>
> > In the meantime I've just updated the FLIP according to this to be
> > optimistic 🙂
> >
> > BR,
> > G
> >
> > On Thu, Mar 27, 2025 at 2:15 PM Gabor Somogyi <gabor.g.somo...@gmail.com>
> > wrote:
> >
> >> Considering all the facts, I am also +1 on PTF. Even if something is
> >> missing, we can add it later.
> >>
> >> @Zakelly Lan <zakelly....@gmail.com> @Shengkai Fang are you also on the
> >> same page or have something to add?
> >>
> >> BR,
> >> G
> >>
> >>
> >> On Thu, Mar 27, 2025 at 1:50 PM Lincoln Lee <lincoln.8...@gmail.com>
> >> wrote:
> >>
> >>> +1 for PTF
> >>>
> >>> > Is it possible to describe such function to see the column names/types?
> >>>
> >>> Although Flink SQL does not directly support this feature, users can
> >>> achieve similar results with the help of `explain` syntax, e.g.
> >>> 'explain select * from read_state_metadata(...)'
> >>>
> >>>
> >>> Best,
> >>> Lincoln Lee
> >>>
> >>>
> >>> On Thu, Mar 27, 2025 at 20:41, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>>
> >>> > Hey!
> >>> >
> >>> > I think the PTF approach strikes a great balance in simplicity and the
> >>> > capabilities that we get out of it.
> >>> >
> >>> > I think this could be a completely viable alternative to the dedicated
> >>> > connector, +1.
> >>> >
> >>> > Cheers,
> >>> > Gyula
> >>> >
> >>> > On Thu, Mar 27, 2025 at 10:37 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >
> >>> > > Hi, Gabor.
> >>> > >
> >>> > > > Do I understand correctly that this is 2.x only feature and we can't
> >>> > > > backport it to 1.x line
> >>> > >
> >>> > > Yes. PTF is only supported in the 2.x versions.
> >>> > >
> >>> > > > Is it possible to describe such function to see the column names/types?
> >>> > >
> >>> > > Flink SQL doesn't support this feature, but PostgreSQL[2] and MySQL[1]
> >>> > > have a similar feature.
> >>> > >
> >>> > > [1] https://dev.mysql.com/doc/refman/8.4/en/show-create-procedure.html
> >>> > > [2] https://stackoverflow.com/questions/6898453/show-the-code-of-a-function-procedure-and-trigger-in-postgresql
> >>> > >
> >>> > > Best,
> >>> > > Shengkai
> >>> > >
> >>> > >
> >>> > > On Thu, Mar 27, 2025 at 16:25, Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> > >
> >>> > > > Hi Shengkai,
> >>> > > >
> >>> > > > Thanks for your effort with the example, this looks promising.
> >>> > > > I like the fact that users wouldn't need to sweat with complex create
> >>> > > > table statements.
> >>> > > >
> >>> > > > A couple of questions:
> >>> > > > * Do I understand correctly that this is a 2.x-only feature and we can't
> >>> > > > backport it to the 1.x line?
> >>> > > > I don't intend to do any backport, I would just like to know the
> >>> > > > technical constraints.
> >>> > > > * Is it possible to describe such a function to see the column names/types?
> >>> > > >
> >>> > > > BR,
> >>> > > > G
> >>> > > >
> >>> > > >
> >>> > > > On Thu, Mar 27, 2025 at 3:17 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> > > >
> >>> > > > > Many thanks for your reminder, Leonard. Here's the link I mentioned[1].
> >>> > > > >
> >>> > > > > Best,
> >>> > > > > Shengkai
> >>> > > > >
> >>> > > > > [1] https://github.com/apache/flink/pull/26358
> >>> > > > >
> >>> > > > > On Thu, Mar 27, 2025 at 10:05, Leonard Xu <xbjt...@gmail.com> wrote:
> >>> > > > >
> >>> > > > > > Your link is broken, Shengkai
> >>> > > > > >
> >>> > > > > > Best,
> >>> > > > > > Leonard
> >>> > > > > >
> >>> > > > > > > On Mar 27, 2025, at 10:01, Shengkai Fang <fskm...@gmail.com> wrote:
> >>> > > > > > >
> >>> > > > > > > Hi, All.
> >>> > > > > > >
> >>> > > > > > > I wrote a simple demo to illustrate my idea. Hope this helps.
> >>> > > > > > >
> >>> > > > > > > Best,
> >>> > > > > > > Shengkai
> >>> > > > > > >
> >>> > > > > > > https://github.com/apache/flink/compare/master...fsk119:flink:example?expand=1
> >>> > > > > > >
> >>> > > > > > > On Wed, Mar 26, 2025 at 15:54, Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> > > > > > >
> >>> > > > > > >>> I'm fine with a separate SQL connector for metadata, so maybe we
> >>> > > > > > >>> could update the FLIP about our discussion?
> >>> > > > > > >>
> >>> > > > > > >> Sorry, I've forgotten this part. Yeah, no matter which one we choose,
> >>> > > > > > >> I'm going to update the FLIP.
> >>> > > > > > >>
> >>> > > > > > >> G
> >>> > > > > > >>
> >>> > > > > > >>
> >>> > > > > > >> On Wed, Mar 26, 2025 at 8:51 AM Gabor Somogyi <gabor.g.somo...@gmail.com>
> >>> > > > > > >> wrote:
> >>> > > > > > >>
> >>> > > > > > >>> Hi All,
> >>> > > > > > >>>
> >>> > > > > > >>> I also lack knowledge of PTFs, so I've read just the motivation part:
> >>> > > > > > >>>
> >>> > > > > > >>> "The SQL 2016 standard introduced a way of defining custom SQL operators
> >>> > > > > > >>> defined by ISO/IEC 19075-7:2021 (Part 7: Polymorphic table functions).
> >>> > > > > > >>> ~200 pages define how this new kind of function can consume and produce
> >>> > > > > > >>> tables with various execution properties.
> >>> > > > > > >>> Unfortunately, this part of the standard is not publicly available."
> >>> > > > > > >>>
> >>> > > > > > >>> Of course we can take a look at some examples, but do we really want to
> >>> > > > > > >>> expose state data with this construct, which is described in ~200 pages
> >>> > > > > > >>> and is part of a standard that is not publicly available? 🙂
> >>> > > > > > >>> I mean the dataset is a couple of rows and the use-case is a join with
> >>> > > > > > >>> another table, like with state data.
> >>> > > > > > >>> If somebody can give advantages I would buy that, but from my limited
> >>> > > > > > >>> understanding this would be overkill here.
> >>> > > > > > >>>
> >>> > > > > > >>> BR,
> >>> > > > > > >>> G
> >>> > > > > > >>>
> >>> > > > > > >>>
> >>> > > > > > >>> On Wed, Mar 26, 2025 at 8:28 AM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> > > > > > >>>
> >>> > > > > > >>>> Hi Zakelly, Shengkai!
> >>> > > > > > >>>>
> >>> > > > > > >>>> I don't know too much about PTFs, it would be interesting to see how the
> >>> > > > > > >>>> usage would look in practice.
> >>> > > > > > >>>>
> >>> > > > > > >>>> Do you have some mockup/example in mind of how the PTF would look, for
> >>> > > > > > >>>> example when we want to:
> >>> > > > > > >>>> - Simply display/aggregate what's in the metadata
> >>> > > > > > >>>> - Join keyed state with some metadata columns
> >>> > > > > > >>>>
> >>> > > > > > >>>> Thanks
> >>> > > > > > >>>> Gyula
> >>> > > > > > >>>>
> >>> > > > > > >>>> On Wed, Mar 26, 2025 at 7:33 AM Zakelly Lan <zakelly....@gmail.com>
> >>> > > > > > >>>> wrote:
> >>> > > > > > >>>>
> >>> > > > > > >>>>> Hi everyone,
> >>> > > > > > >>>>>
> >>> > > > > > >>>>> I'm fine with a separate SQL connector for metadata, so maybe we could
> >>> > > > > > >>>>> update the FLIP about our discussion? Also, Shengkai provided a PTF
> >>> > > > > > >>>>> implementation; does that also meet the requirement?
> >>> > > > > > >>>>>
> >>> > > > > > >>>>>
> >>> > > > > > >>>>> Best,
> >>> > > > > > >>>>> Zakelly
> >>> > > > > > >>>>>
> >>> > > > > > >>>>> On Thu, Mar 20, 2025 at 4:47 PM Gabor Somogyi <gabor.g.somo...@gmail.com>
> >>> > > > > > >>>>> wrote:
> >>> > > > > > >>>>>
> >>> > > > > > >>>>>> Hi All,
> >>> > > > > > >>>>>>
> >>> > > > > > >>>>>> @Zakelly: Gyula summarised correctly what I meant, so please treat the
> >>> > > > > > >>>>>> content as mine.
> >>> > > > > > >>>>>> In addition, I'm not against adding a CLI at all, I'm just stating that
> >>> > > > > > >>>>>> in some cases like this, users would like to have a self-serving
> >>> > > > > > >>>>>> solution where they can provide SQL statements which can trigger alerts
> >>> > > > > > >>>>>> automatically.
> >>> > > > > > >>>>>>
> >>> > > > > > >>>>>> My personal opinion is that a CLI would be beneficial in several cases.
> >>> > > > > > >>>>>> A good example is when users want to restart a job from specific Kafka
> >>> > > > > > >>>>>> offsets which are persisted in a savepoint. For such a scenario users
> >>> > > > > > >>>>>> are more than happy since they expect manual intervention with full
> >>> > > > > > >>>>>> control. So all in all, one can count on my +1 when a CLI FLIP comes
> >>> > > > > > >>>>>> up...
> >>> > > > > > >>>>>>
> >>> > > > > > >>>>>> BR,
> >>> > > > > > >>>>>> G
> >>> > > > > > >>>>>>
> >>> > > > > > >>>>>>
> >>> > > > > > >>>>>> On Thu, Mar 20, 2025 at 8:20 AM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> > > > > > >>>>>>
> >>> > > > > > >>>>>>> Hi!
> >>> > > > > > >>>>>>>
> >>> > > > > > >>>>>>> @Zakelly Lan <zakelly....@gmail.com>
> >>> > > > > > >>>>>>> I think what Gabor means is that users want to have predefined SQL
> >>> > > > > > >>>>>>> scripts to perform state analysis tasks to debug/identify problems,
> >>> > > > > > >>>>>>> such as a SQL script that joins the metadata table with the state and
> >>> > > > > > >>>>>>> does some analytics on it.
> >>> > > > > > >>>>>>>
> >>> > > > > > >>>>>>> If we have a meta table then the SQL script that can do this is fixed
> >>> > > > > > >>>>>>> and users can trigger this on demand by simply providing a new
> >>> > > > > > >>>>>>> savepoint path.
> >>> > > > > > >>>>>>>
> >>> > > > > > >>>>>>> If we have a different mechanism to extract metadata that is not SQL
> >>> > > > > > >>>>>>> native, then manual steps need to be executed and a custom SQL script
> >>> > > > > > >>>>>>> would need to be written that adds the manually extracted metadata
> >>> > > > > > >>>>>>> into the script.
> >>> > > > > > >>>>>>>
> >>> > > > > > >>>>>>> Cheers,
> >>> > > > > > >>>>>>> Gyula
> >>> > > > > > >>>>>>>
> >>> > > > > > >>>>>>> On Thu, Mar 20, 2025 at 4:32 AM Zakelly Lan <zakelly....@gmail.com>
> >>> > > > > > >>>>>>> wrote:
> >>> > > > > > >>>>>>>
> >>> > > > > > >>>>>>>> Hi all,
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>> Thanks for your answers! Getting everyone aligned on this topic is
> >>> > > > > > >>>>>>>> challenging, but it’s definitely worth the effort since it will help
> >>> > > > > > >>>>>>>> streamline things moving forward.
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>> @Gabor, are you saying that users are using some scripts to define the
> >>> > > > > > >>>>>>>> SQL metadata connector and get the information? If so, would a CLI tool
> >>> > > > > > >>>>>>>> be more convenient? It's easy to invoke and can get the result swiftly.
> >>> > > > > > >>>>>>>> And there should be some other systems to track the checkpoint lineage
> >>> > > > > > >>>>>>>> and analyze if there are outliers in metadata (e.g. state size of one
> >>> > > > > > >>>>>>>> operator), right? Well, maybe I missed something, so please correct me
> >>> > > > > > >>>>>>>> if I'm wrong.
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>>> I think the overall vision in Flink SQL is to provide a SQL native
> >>> > > > > > >>>>>>>>> environment where we can serve complex use-cases like you would
> >>> > > > > > >>>>>>>>> expect in a regular database.
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>> @Gyula Well, this is a good point. From the perspective of a
> >>> > > > > > >>>>>>>> comprehensive SQL experience, I'd +1 for treating metadata as data.
> >>> > > > > > >>>>>>>> Although I doubt there is a need for processing metadata, I won't be
> >>> > > > > > >>>>>>>> against a separate connector.
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>> Regarding the CLI tool, I still think it’s worth implementing. Such a
> >>> > > > > > >>>>>>>> tool could provide savepoint information before resuming from a
> >>> > > > > > >>>>>>>> savepoint, which would enhance the user experience in CLI-based
> >>> > > > > > >>>>>>>> workflows. It would be good if someone could implement this feature. We
> >>> > > > > > >>>>>>>> shouldn’t worry about whether this tool might be retired in the future.
> >>> > > > > > >>>>>>>> Regardless of the SQL-based solution we eventually adopt, this
> >>> > > > > > >>>>>>>> capability will remain essential for CLI users. This is another topic.
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>> Best,
> >>> > > > > > >>>>>>>> Zakelly
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>> On Thu, Mar 20, 2025 at 10:37 AM Shengkai Fang <fskm...@gmail.com>
> >>> > > > > > >>>>>>>> wrote:
> >>> > > > > > >>>>>>>>
> >>> > > > > > >>>>>>>>> Hi.
> >>> > > > > > >>>>>>>>>
> >>> > > > > > >>>>>>>>> After reading the doc[1], I think Spark provides a function for users
> >>> > > > > > >>>>>>>>> to consume the metadata from the savepoint. In Flink SQL, similar
> >>> > > > > > >>>>>>>>> functionality is implemented through Polymorphic Table Functions (PTF)
> >>> > > > > > >>>>>>>>> as proposed in FLIP-440[2]. Below is a code example[3] illustrating
> >>> > > > > > >>>>>>>>> this concept:
> >>> > > > > > >>>>>>>>>
> >>> > > > > > >>>>>>>>> ```
> >>> > > > > > >>>>>>>>>    public static class ScalarArgsFunction extends TestProcessTableFunctionBase {
> >>> > > > > > >>>>>>>>>        public void eval(Integer i, Boolean b) {
> >>> > > > > > >>>>>>>>>            collectObjects(i, b);
> >>> > > > > > >>>>>>>>>        }
> >>> > > > > > >>>>>>>>>    }
> >>> > > > > > >>>>>>>>> ```
> >>> > > > > > >>>>>>>>>
> >>> > > > > > >>>>>>>>> ```
> >>> > > > > > >>>>>>>>> INSERT INTO sink SELECT * FROM f(i => 42, b => CAST('TRUE' AS BOOLEAN))
> >>> > > > > > >>>>>>>>> ```
> >>> > > > > > >>>>>>>>>
> >>> > > > > > >>>>>>>>> So we can add a builtin function named `read_state_metadata` to read
> >>> > > > > > >>>>>>>>> savepoint data.
> >>> > > > > > >>>>>>>>>
> >>> > > > > > >>>>>>>>> Best,
> >>> > > > > > >>>>>>>>> Shengkai
> >>> > > > > > >>>>>>>>>
> >>> > > > > > >>>>>>>>> [1] https://docs.databricks.com/aws/en/structured-streaming/read-state?language=SQL
> >>> > > > > > >>>>>>>>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093
> >>> > > > > > >>>>>>>>> [3] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/stream/ProcessTableFunctionTestPrograms.java#L140
> >>> > > > > > >>>>>>>>>
> >>> > > > > > >>>>>>>>> On Wed, Mar 19, 2025 at 18:37, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> > > > > > >>>>>>>>>
> >>> > > > > > >>>>>>>>>> Hi All!
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>> Thank you for the answers and concerns from everyone.
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>> On the CLI vs State Metadata Connector/Table question I would also like
> >>> > > > > > >>>>>>>>>> to step back a little and look at the bigger picture.
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>> I think the overall vision in Flink SQL is to provide a SQL native
> >>> > > > > > >>>>>>>>>> environment where we can serve complex use-cases like you would expect
> >>> > > > > > >>>>>>>>>> in a regular database.
> >>> > > > > > >>>>>>>>>> Most features and developments in recent years have gone this way.
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>> The State Metadata Table would be a natural and straightforward fit
> >>> > > > > > >>>>>>>>>> here. So from my side, +1 for that.
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>> However, I could understand if we are not ready to add a new
> >>> > > > > > >>>>>>>>>> connector/format due to maintenance concerns (and in general concerns
> >>> > > > > > >>>>>>>>>> about the design).
> >>> > > > > > >>>>>>>>>> If that's the issue then we should spend more time on the design to get
> >>> > > > > > >>>>>>>>>> comfortable with the approach and seek feedback from the wider
> >>> > > > > > >>>>>>>>>> community.
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>> I am -1 for the CLI/tooling approach as that will not provide the
> >>> > > > > > >>>>>>>>>> feature set we are looking for that is not already covered by the Java
> >>> > > > > > >>>>>>>>>> connector. And that approach would come with the same maintenance
> >>> > > > > > >>>>>>>>>> implications.
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>> Cheers
> >>> > > > > > >>>>>>>>>> Gyula
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>> On Wed, Mar 19, 2025 at 11:24 AM Gabor Somogyi <gabor.g.somo...@gmail.com>
> >>> > > > > > >>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>>> Hi Zakelly, Shengkai
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>> Several topics are going on, so I'm adding the gist of my answers to
> >>> > > > > > >>>>>>>>>>> them. If some topic is not touched, please highlight it.
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>> @Shengkai: I've read through all the previous FLIPs related to
> >>> > > > > > >>>>>>>>>>> catalogs, and if we would like to keep the concepts there, then a
> >>> > > > > > >>>>>>>>>>> one-to-one mapping relationship between savepoint and catalog is a
> >>> > > > > > >>>>>>>>>>> reasonable direction. In short, I'm happy that you've highlighted this
> >>> > > > > > >>>>>>>>>>> and I agree as a whole. I've written it down previously, I just want
> >>> > > > > > >>>>>>>>>>> to double confirm that a state catalog is essential and planned. When
> >>> > > > > > >>>>>>>>>>> we reach this point, your input is more than welcome.
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>> @Zakelly: We've tried the CLI and separate library approaches with
> >>> > > > > > >>>>>>>>>>> users already, and these were not welcomed because of the following:
> >>> > > > > > >>>>>>>>>>> * Users want to have automated tasks and not manual CLI/library output
> >>> > > > > > >>>>>>>>>>> parsing. This can be hacked around, but our experience with this is
> >>> > > > > > >>>>>>>>>>> negative because it's just brittle.
> >>> > > > > > >>>>>>>>>>> * From a development perspective it's a much bigger effort than a
> >>> > > > > > >>>>>>>>>>> connector (hard to test, packaging/version handling is an extra layer
> >>> > > > > > >>>>>>>>>>> of complexity, external FS authentication is a pain for users, as is
> >>> > > > > > >>>>>>>>>>> expecting them to download savepoints)
> >>> > > > > > >>>>>>>>>>> * Purely personal opinion, but if we find better ways later, then
> >>> > > > > > >>>>>>>>>>> retiring a CLI is no more lightweight than retiring a connector
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> It would be great if you give some examples on how user could
> >>> > > > > > >>>>>>>>>>>> leverage the separate connector to process the metadata.
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>> The simplest cases:
> >>> > > > > > >>>>>>>>>>> * give me the overgrowing state uids
> >>> > > > > > >>>>>>>>>>> * give me the not known (new or renamed) state uids
> >>> > > > > > >>>>>>>>>>> * give me the state uids where state size drastically dropped compared
> >>> > > > > > >>>>>>>>>>> to a previous savepoint (accidental state loss)
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>> Since it was mentioned: as a general off-topic teaser, yeah, it would
> >>> > > > > > >>>>>>>>>>> be good to have some sort of checkpoint/savepoint lineage, or however
> >>> > > > > > >>>>>>>>>>> we call it. Since we've not yet reached this point there are no
> >>> > > > > > >>>>>>>>>>> technical details, it's more like a vision. It's a common pattern that
> >>> > > > > > >>>>>>>>>>> jobs are physically running but somehow the state processing is stuck,
> >>> > > > > > >>>>>>>>>>> and it would be good to add some way to find this out automatically.
> >>> > > > > > >>>>>>>>>>> The important word here is automation and not manual evaluation, since
> >>> > > > > > >>>>>>>>>>> handling 10k+ jobs just does not allow that.
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>> BR,
> >>> > > > > > >>>>>>>>>>> G
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>> On Wed, Mar 19, 2025 at 6:46 AM Shengkai Fang <fskm...@gmail.com>
> >>> > > > > > >>>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> Hi, All.
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> About State Catalog, I want to share more thoughts about this.
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> In the initial design concept, I understood that a savepoint and a
> >>> > > > > > >>>>>>>>>>>> state catalog have a one-to-one mapping relationship. Each operator
> >>> > > > > > >>>>>>>>>>>> corresponds to a database, and the state of each operator is
> >>> > > > > > >>>>>>>>>>>> represented as individual tables. The rationale behind this design is:
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> *State Diversity*: An operator may involve multiple types of states.
> >>> > > > > > >>>>>>>>>>>> For example, in our VVR design, a "multi-join" operator uses keyed
> >>> > > > > > >>>>>>>>>>>> states for two input streams and a broadcast state for the third
> >>> > > > > > >>>>>>>>>>>> stream. This makes it challenging to represent all states of an
> >>> > > > > > >>>>>>>>>>>> operator within a single table.
> >>> > > > > > >>>>>>>>>>>> *Scalability*: Internally, an operator might have multiple keyed
> >>> > > > > > >>>>>>>>>>>> states (e.g., value state and list state). However, large list states
> >>> > > > > > >>>>>>>>>>>> may not fit entirely in memory. To address this, we recommend
> >>> > > > > > >>>>>>>>>>>> implementing each state as a separate table.
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> To resolve the loosely coupled relationships between operator states,
> >>> > > > > > >>>>>>>>>>>> we propose embedding predefined views within the catalog. These views
> >>> > > > > > >>>>>>>>>>>> simplify user understanding of operator implementations and provide a
> >>> > > > > > >>>>>>>>>>>> more intuitive perspective. For instance, a join operator may have
> >>> > > > > > >>>>>>>>>>>> multiple state implementations (depending on whether the join key
> >>> > > > > > >>>>>>>>>>>> includes unique attributes), but users primarily care about the data
> >>> > > > > > >>>>>>>>>>>> associated with a specific join key across input streams.
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> Returning to the one-to-one mapping between savepoints and catalogs,
> >>> > > > > > >>>>>>>>>>>> we aim to manage multiple user state catalogs through a catalog
> >>> > > > > > >>>>>>>>>>>> store. When a user triggers a savepoint for a job on the platform:
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> 1. The platform sends a REST request to the JobManager.
> >>> > > > > > >>>>>>>>>>>> 2. Simultaneously, it registers a new state catalog in the catalog
> >>> > > > > > >>>>>>>>>>>> store, enabling immediate analysis of state data on the platform.
> >>> > > > > > >>>>>>>>>>>> 3. Deleting a savepoint would also trigger the removal of its
> >>> > > > > > >>>>>>>>>>>> associated catalog.
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> This vision assumes that states are self-describing or that a state
> >>> > > > > > >>>>>>>>>>>> metaservice is introduced to analyze savepoint structures.
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> How can users create logic to identify differences between multiple
> >>> > > > > > >>>>>>>>>>>>> savepoints?
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> Since savepoints and state catalogs are one-to-one mapped, users can
> >>> > > > > > >>>>>>>>>>>> query metadata via their respective catalogs. For example:
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> 1. `savepoint-${id}`.`system`.`metadata_table`.`<operator-name>`
> >>> > > > > > >>>>>>>>>>>> provides operator-specific metadata (e.g., state size, type).
> >>> > > > > > >>>>>>>>>>>> 2. Comparing metadata tables (e.g., schema versions, state entry
> >>> > > > > > >>>>>>>>>>>> counts) across catalogs reveals structural or quantitative
> >>> > > > > > >>>>>>>>>>>> differences.
> >>> > > > > > >>>>>>>>>>>> 3. For deeper analysis, users could write SQL queries to compare
> >>> > > > > > >>>>>>>>>>>> specific state partitions or leverage the metaservice to track state
> >>> > > > > > >>>>>>>>>>>> evolution (e.g., added/removed operators, modified state
> >>> > > > > > >>>>>>>>>>>> configurations).
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> If we plan to introduce a state catalog in the future, I would lean
> >>> > > > > > >>>>>>>>>>>> toward using metadata tables. If a utility tool can address the
> >>> > > > > > >>>>>>>>>>>> challenges we face, could we avoid introducing an additional
> >>> > > > > > >>>>>>>>>>>> connector?
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> Best,
> >>> > > > > > >>>>>>>>>>>> Shengkai
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> Gyula Fóra <gyula.f...@gmail.com> wrote on Mon, Mar 17, 2025 at
> >>> > > > > > >>>>>>>>>>>> 20:25:
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> Hi All!
> >>> > > > > > >>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> Without going into too much detail here are my
> 2
> >>> > cents
> >>> > > > > > >>>>>>> regarding
> >>> > > > > > >>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>> virtual column / catalog metadata / table
> >>> (connector)
> >>> > > > > > >>>>>>> discussion
> >>> > > > > > >>>>>>>>> for
> >>> > > > > > >>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>> State metadata.
> >>> > > > > > >>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> State metadata such as the types of states,
> their
> >>> > > > > > >>>>> properties,
> >>> > > > > > >>>>>>>>> names,
> >>> > > > > > >>>>>>>>>>>> sizes
> >>> > > > > > >>>>>>>>>>>>> etc are all valuable information that can be
> >>> used to
> >>> > > > > > >>>> enrich
> >>> > > > > > >>>>>>> the
> >>> > > > > > >>>>>>>>>>>>> computations we do on state.
> >>> > > > > > >>>>>>>>>>>>> We can either analyze it standalone (such as
> >>> discover
> >>> > > > > > >>>>>>> anomalies,
> >>> > > > > > >>>>>>>>> for
> >>> > > > > > >>>>>>>>>>>> large
> >>> > > > > > >>>>>>>>>>>>> jobs with many states), across multiple
> >>> savepoints
> >>> > > > > > >>>> (discover
> >>> > > > > > >>>>>>> how
> >>> > > > > > >>>>>>>>>> state
> >>> > > > > > >>>>>>>>>>>>> changed over time) or by joining it with keyed
> or
> >>> > > > > > >>>> non-keyed
> >>> > > > > > >>>>>>> state
> >>> > > > > > >>>>>>>>>> data
> >>> > > > > > >>>>>>>>>>> to
> >>> > > > > > >>>>>>>>>>>>> serve more complex queries on the state.
> >>> > > > > > >>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> The only solution that seems to serve all these
> >>> > > > > > >> use-cases
> >>> > > > > > >>>>> and
> >>> > > > > > >>>>>>>>>>>> requirements
> >>> > > > > > >>>>>>>>>>>>> in a straightforward and SQL canonical way is
> to
> >>> > simply
> >>> > > > > > >>>>> expose
> >>> > > > > > >>>>>>>> the
> >>> > > > > > >>>>>>>>>>> state
> >>> > > > > > >>>>>>>>>>>>> metadata as a separate table. This is a
> metadata
> >>> > table
> >>> > > > > > >>>> but
> >>> > > > > > >>>>> you
> >>> > > > > > >>>>>>>> can
> >>> > > > > > >>>>>>>>>> also
> >>> > > > > > >>>>>>>>>>>>> think of it as data table, it makes no
> practical
> >>> > > > > > >>>> difference
> >>> > > > > > >>>>>>> here.
> >>> > > > > > >>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> Once we have a catalog later, the catalog can
> >>> offer
> >>> > > > > > >> this
> >>> > > > > > >>>>> table
> >>> > > > > > >>>>>>>> out
> >>> > > > > > >>>>>>>>> of
> >>> > > > > > >>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>> box, the same way databases provide metadata
> >>> tables.
> >>> > > > > > >> For
> >>> > > > > > >>>>> this
> >>> > > > > > >>>>>>> to
> >>> > > > > > >>>>>>>>> work
> >>> > > > > > >>>>>>>>>>>>> however we need another, simpler connector that
> >>> > creates
> >>> > > > > > >>>> this
> >>> > > > > > >>>>>>>> table.
> >>> > > > > > >>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> +1 for state metadata as a separate
> >>> connector/table,
> >>> > > > > > >>>> instead
> >>> > > > > > >>>>>>> of
> >>> > > > > > >>>>>>>>>> adding
> >>> > > > > > >>>>>>>>>>>>> virtual columns and adhoc catalog metadata that
> >>> is
> >>> > hard
> >>> > > > > > >>>> to
> >>> > > > > > >>>>> use
> >>> > > > > > >>>>>>>> in a
> >>> > > > > > >>>>>>>>>>> large
> >>> > > > > > >>>>>>>>>>>>> number of queries.
> >>> > > > > > >>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> Cheers,
> >>> > > > > > >>>>>>>>>>>>> Gyula
> >>> > > > > > >>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>> On Mon, Mar 17, 2025 at 12:44 PM Gabor Somogyi
> <
> >>> > > > > > >>>>>>>>>>>> gabor.g.somo...@gmail.com>
> >>> > > > > > >>>>>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>> I’m planning on adding this, and we may
> >>> collaborate
> >>> > > > > > >>>> on
> >>> > > > > > >>>>> it
> >>> > > > > > >>>>>>> in
> >>> > > > > > >>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>> future.
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> +1 on this, just ping me.
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> After some code digging and a POC, all I can say is that
> >>> > > > > > >>>>>>>>>>>>>> with heavy effort we can maybe make changes that let us
> >>> > > > > > >>>>>>>>>>>>>> show the metadata of a savepoint from the catalog.
> >>> > > > > > >>>>>>>>>>>>>> I'm not against that but from user perspective
> >>> this
> >>> > > > > > >> has
> >>> > > > > > >>>>>>> limited
> >>> > > > > > >>>>>>>>>>> value,
> >>> > > > > > >>>>>>>>>>>>> let
> >>> > > > > > >>>>>>>>>>>>>> me explain why.
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> From high level perspective I see the
> following
> >>> > > > > > >> which I
> >>> > > > > > >>>>> see
> >>> > > > > > >>>>>>>>>> agreement
> >>> > > > > > >>>>>>>>>>>> on:
> >>> > > > > > >>>>>>>>>>>>>> * We should have a catalog which is
> >>> representing one
> >>> > > > > > >> or
> >>> > > > > > >>>>> more
> >>> > > > > > >>>>>>>> jobs
> >>> > > > > > >>>>>>>>>>>>> savepoint
> >>> > > > > > >>>>>>>>>>>>>> data set (future plan)
> >>> > > > > > >>>>>>>>>>>>>> * Savepoints should be able to be registered
> in
> >>> the
> >>> > > > > > >>>>> catalog
> >>> > > > > > >>>>>>>> which
> >>> > > > > > >>>>>>>>>> are
> >>> > > > > > >>>>>>>>>>>>> then
> >>> > > > > > >>>>>>>>>>>>>> databases (future plan)
> >>> > > > > > >>>>>>>>>>>>>> * There must be a possibility to create tables
> >>> from
> >>> > > > > > >>>>> databases
> >>> > > > > > >>>>>>>>> where
> >>> > > > > > >>>>>>>>>>>> users
> >>> > > > > > >>>>>>>>>>>>>> can read state data (exists already)
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> In terms of metadata, If I understand
> correctly
> >>> then
> >>> > > > > > >>>> the
> >>> > > > > > >>>>>>>>> suggested
> >>> > > > > > >>>>>>>>>>>>> approach
> >>> > > > > > >>>>>>>>>>>>>> would be to access
> >>> > > > > > >>>>>>>>>>>>>> it from the catalog describe command, right?
> >>> Adding
> >>> > > > > > >>>> that
> >>> > > > > > >>>>>>> info
> >>> > > > > > >>>>>>>>> when
> >>> > > > > > >>>>>>>>>>>>> specific
> >>> > > > > > >>>>>>>>>>>>>> database describe command
> >>> > > > > > >>>>>>>>>>>>>> is executed could be done.
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> The question is for instance how can users
> >>> create
> >>> > > > > > >> such
> >>> > > > > > >>>> a
> >>> > > > > > >>>>>>> logic
> >>> > > > > > >>>>>>>>> that
> >>> > > > > > >>>>>>>>>>>> tells
> >>> > > > > > >>>>>>>>>>>>>> them what is
> >>> > > > > > >>>>>>>>>>>>>> the difference between multiple savepoints?
> >>> > > > > > >>>>>>>>>>>>>> Just to give some examples:
> >>> > > > > > >>>>>>>>>>>>>> * per operator size changes between savepoints
> >>> > > > > > >>>>>>>>>>>>>> * show values from operator data where state
> >>> size
> >>> > > > > > >>>> reaches
> >>> > > > > > >>>>> a
> >>> > > > > > >>>>>>>>>> boundary
> >>> > > > > > >>>>>>>>>>>>>> * in general "find which checkpoint ruined
> >>> things"
> >>> > is
> >>> > > > > > >>>>> quite
> >>> > > > > > >>>>>>>>> common
> >>> > > > > > >>>>>>>>>>>>> pattern
> >>> > > > > > >>>>>>>>>>>>>> What I would like to highlight here is that, from Flink's
> >>> > > > > > >>>>>>>>>>>>>> point of view, the metadata can be considered static
> >>> > > > > > >>>>>>>>>>>>>> side-output information, but for users these values are
> >>> > > > > > >>>>>>>>>>>>>> actual data around which logic is planned to be built.
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>> The metadata is more like one-time
> information
> >>> > > > > > >>>> instead
> >>> > > > > > >>>>> of
> >>> > > > > > >>>>>>> a
> >>> > > > > > >>>>>>>>>>> streaming
> >>> > > > > > >>>>>>>>>>>>>> data that changes all
> >>> > > > > > >>>>>>>>>>>>>> the time, so a single connector seems to be an
> >>> > > > > > >>>> overkill.
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> State data is also static within a savepoint
> and
> >>> > > > > > >> that's
> >>> > > > > > >>>>> the
> >>> > > > > > >>>>>>>>> reason
> >>> > > > > > >>>>>>>>>>> why
> >>> > > > > > >>>>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>> state processor API is working in batch mode.
> >>> > > > > > >>>>>>>>>>>>>> When we handle multiple checkpoints in a
> >>> streaming
> >>> > > > > > >>>> fashion
> >>> > > > > > >>>>>>> then
> >>> > > > > > >>>>>>>>>> this
> >>> > > > > > >>>>>>>>>>>> can
> >>> > > > > > >>>>>>>>>>>>> be
> >>> > > > > > >>>>>>>>>>>>>> viewed from another angle.
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> We can come up with a more lightweight solution than a new
> >>> > > > > > >>>>>>>>>>>>>> connector, but forcing users to parse the catalog describe
> >>> > > > > > >>>>>>>>>>>>>> command output in order to compare multiple savepoints
> >>> > > > > > >>>>>>>>>>>>>> doesn't sound like a smooth user experience.
> >>> > > > > > >>>>>>>>>>>>>> Honestly, I have no other idea for how to expose metadata
> >>> > > > > > >>>>>>>>>>>>>> as real user data, so I'm waiting on other approaches.
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> BR,
> >>> > > > > > >>>>>>>>>>>>>> G
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> On Thu, Mar 13, 2025 at 2:44 AM Shengkai Fang
> <
> >>> > > > > > >>>>>>>> fskm...@gmail.com
> >>> > > > > > >>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>> Looking forward to hearing the good news!
> >>> > > > > > >>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>> Best,
> >>> > > > > > >>>>>>>>>>>>>>> Shengkai
> >>> > > > > > >>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on
> >>> > > > > > >>>>>>>>>>>>>>> Wed, Mar 12, 2025 at 22:24:
> >>> > > > > > >>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>> Thanks for both the valuable input!
> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>> Let me take a closer look at the
> suggestions,
> >>> > > > > > >> like
> >>> > > > > > >>>> the
> >>> > > > > > >>>>>>>>> Catalog
> >>> > > > > > >>>>>>>>>>>>>>> capabilities
> >>> > > > > > >>>>>>>>>>>>>>>> and possibility of embedding TypeInformation
> >>> or
> >>> > > > > > >>>>>>>>>>>>>>>> StateDescriptor metadata directly into the
> raw
> >>> > > > > > >>>> state
> >>> > > > > > >>>>>>>> files...
> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>> BR,
> >>> > > > > > >>>>>>>>>>>>>>>> G
> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>> On Wed, Mar 12, 2025 at 8:17 AM Shengkai
> Fang
> >>> <
> >>> > > > > > >>>>>>>>>> fskm...@gmail.com
> >>> > > > > > >>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> Thanks for Zakelly's clarification.
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> +1 to delay the discussion about this.
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> I’d like to share my perspective on the
> State
> >>> > > > > > >>>>> Catalog
> >>> > > > > > >>>>>>>>>> proposal.
> >>> > > > > > >>>>>>>>>>>>> While
> >>> > > > > > >>>>>>>>>>>>>>>>> introducing this capability is beneficial,
> >>> > > > > > >> there
> >>> > > > > > >>>> is
> >>> > > > > > >>>>> a
> >>> > > > > > >>>>>>>>>> blocker:
> >>> > > > > > >>>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>>>> current
> >>> > > > > > >>>>>>>>>>>>>>>>> StateBackend architecture does not permit
> >>> > > > > > >>>> operators
> >>> > > > > > >>>>> to
> >>> > > > > > >>>>>>>>> encode
> >>> > > > > > >>>>>>>>>>>>>>>>> TypeInformation into the state—it only
> >>> > > > > > >> preserves
> >>> > > > > > >>>> the
> >>> > > > > > >>>>>>>>>>> Serializer.
> >>> > > > > > >>>>>>>>>>>>> This
> >>> > > > > > >>>>>>>>>>>>>>>>> limitation creates an asymmetry, as
> operators
> >>> > > > > > >>>> alone
> >>> > > > > > >>>>>>>> retain
> >>> > > > > > >>>>>>>>>>>>> knowledge
> >>> > > > > > >>>>>>>>>>>>>> of
> >>> > > > > > >>>>>>>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>>>>> data structure’s schema.
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> To address this, I suggest allowing
> operators
> >>> > > > > > >> to
> >>> > > > > > >>>>> embed
> >>> > > > > > >>>>>>>>>>>>>> TypeInformation
> >>> > > > > > >>>>>>>>>>>>>>> or
> >>> > > > > > >>>>>>>>>>>>>>>>> StateDescriptor metadata directly into the
> >>> raw
> >>> > > > > > >>>> state
> >>> > > > > > >>>>>>>> files.
> >>> > > > > > >>>>>>>>>>> Such
> >>> > > > > > >>>>>>>>>>>> a
> >>> > > > > > >>>>>>>>>>>>>>> design
> >>> > > > > > >>>>>>>>>>>>>>>>> would enable the Catalog to:
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> 1. Parse state files and programmatically
> >>> > > > > > >> derive
> >>> > > > > > >>>> the
> >>> > > > > > >>>>>>>> schema
> >>> > > > > > >>>>>>>>>> and
> >>> > > > > > >>>>>>>>>>>>>>>> structural
> >>> > > > > > >>>>>>>>>>>>>>>>> guarantees for each state.
> >>> > > > > > >>>>>>>>>>>>>>>>> 2. Leverage existing Flink Table utilities,
> >>> > > > > > >> such
> >>> > > > > > >>>> as
> >>> > > > > > >>>>>>>>>>>>>>>>> LegacyTypeInfoDataTypeConverter (in
> >>> > > > > > >>>>>>>>>>>>>>> org.apache.flink.table.types.utils),
> >>> > > > > > >>>>>>>>>>>>>>>> to
> >>> > > > > > >>>>>>>>>>>>>>>>> bridge TypeInformation and DataType
> >>> > > > > > >> conversions.
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> If we can not store the TypeInformation or
> >>> > > > > > >>>>>>>> StateDescriptor
> >>> > > > > > >>>>>>>>>> into
> >>> > > > > > >>>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>> raw
> >>> > > > > > >>>>>>>>>>>>>>>>> state files, I am +1 for this FLIP to use
> >>> > > > > > >>>> metadata
> >>> > > > > > >>>>>>> column
> >>> > > > > > >>>>>>>>> to
> >>> > > > > > >>>>>>>>>>>>> retrieve
> >>> > > > > > >>>>>>>>>>>>>>>>> information.
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> Best,
> >>> > > > > > >>>>>>>>>>>>>>>>> Shengkai
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>> Zakelly Lan <zakelly....@gmail.com> wrote on
> >>> > > > > > >>>>>>>>>>>>>>>>> Wed, Mar 12, 2025 at 12:43:
> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> Hi Gabor and Shengkai,
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> Thanks for sharing your thoughts! This is
> a
> >>> > > > > > >>>> long
> >>> > > > > > >>>>>>>>> discussion
> >>> > > > > > >>>>>>>>>>> and
> >>> > > > > > >>>>>>>>>>>>>> sorry
> >>> > > > > > >>>>>>>>>>>>>>>> for
> >>> > > > > > >>>>>>>>>>>>>>>>>> the late reply (I'm busy catching up with
> >>> > > > > > >>>> release
> >>> > > > > > >>>>>>> 2.0
> >>> > > > > > >>>>>>>>> these
> >>> > > > > > >>>>>>>>>>>>> days).
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> Let me first clarify your thoughts to
> ensure
> >>> > > > > > >> I
> >>> > > > > > >>>>>>>> understand
> >>> > > > > > >>>>>>>>>>>>>> correctly.
> >>> > > > > > >>>>>>>>>>>>>>>>> IIUC,
> >>> > > > > > >>>>>>>>>>>>>>>>>> there is no persistent configuration for
> >>> > > > > > >> state
> >>> > > > > > >>>> TTL
> >>> > > > > > >>>>>>> in
> >>> > > > > > >>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>> checkpoint.
> >>> > > > > > >>>>>>>>>>>>>>>>> While
> >>> > > > > > >>>>>>>>>>>>>>>>>> you can infer that TTL is enabled by
> reading
> >>> > > > > > >>>> the
> >>> > > > > > >>>>>>>>>> serializer,
> >>> > > > > > >>>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>>>>> checkpoint
> >>> > > > > > >>>>>>>>>>>>>>>>>> itself only stores the last access time
> for
> >>> > > > > > >>>> each
> >>> > > > > > >>>>>>> value.
> >>> > > > > > >>>>>>>>> So
> >>> > > > > > >>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>> only
> >>> > > > > > >>>>>>>>>>>>>>>> thing
> >>> > > > > > >>>>>>>>>>>>>>>>>> we can show is the last access time for
> each
> >>> > > > > > >>>>> value.
> >>> > > > > > >>>>>>> But
> >>> > > > > > >>>>>>>>> it
> >>> > > > > > >>>>>>>>>> is
> >>> > > > > > >>>>>>>>>>>> not
> >>> > > > > > >>>>>>>>>>>>>>>>> required
> >>> > > > > > >>>>>>>>>>>>>>>>>> for all state backends to store this, as
> >>> they
> >>> > > > > > >>>> may
> >>> > > > > > >>>>>>>>> directly
> >>> > > > > > >>>>>>>>>>>> store
> >>> > > > > > >>>>>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>>>>>> expired time. This will also increase the
> >>> > > > > > >>>>>>> difficulty of
> >>> > > > > > >>>>>>>>>>>>>>> implementation
> >>> > > > > > >>>>>>>>>>>>>>>> &
> >>> > > > > > >>>>>>>>>>>>>>>>>> maintenance.
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> This once again reiterates the importance
> of
> >>> > > > > > >>>>> unified
> >>> > > > > > >>>>>>>>>> metadata
> >>> > > > > > >>>>>>>>>>>> for
> >>> > > > > > >>>>>>>>>>>>>>>>>> checkpoints. I’m planning on adding this,
> >>> and
> >>> > > > > > >>>> we
> >>> > > > > > >>>>> may
> >>> > > > > > >>>>>>>>>>>> collaborate
> >>> > > > > > >>>>>>>>>>>>> on
> >>> > > > > > >>>>>>>>>>>>>>> it
> >>> > > > > > >>>>>>>>>>>>>>>> in
> >>> > > > > > >>>>>>>>>>>>>>>>>> the future.
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> I'm not in favor of adding a new connector
> >>> > > > > > >> for
> >>> > > > > > >>>>>>>> metadata.
> >>> > > > > > >>>>>>>>>> The
> >>> > > > > > >>>>>>>>>>>>>> metadata
> >>> > > > > > >>>>>>>>>>>>>>>> is
> >>> > > > > > >>>>>>>>>>>>>>>>>> more like one-time information instead of
> a
> >>> > > > > > >>>>>>> streaming
> >>> > > > > > >>>>>>>>> data
> >>> > > > > > >>>>>>>>>>> that
> >>> > > > > > >>>>>>>>>>>>>>> changes
> >>> > > > > > >>>>>>>>>>>>>>>>> all
> >>> > > > > > >>>>>>>>>>>>>>>>>> the time, so a single connector seems to
> be
> >>> > > > > > >> an
> >>> > > > > > >>>>>>>> overkill.
> >>> > > > > > >>>>>>>>> It
> >>> > > > > > >>>>>>>>>>> is
> >>> > > > > > >>>>>>>>>>>>> not
> >>> > > > > > >>>>>>>>>>>>>>> easy
> >>> > > > > > >>>>>>>>>>>>>>>>> to
> >>> > > > > > >>>>>>>>>>>>>>>>>> withdraw a connector if we have a better
> >>> > > > > > >>>> solution
> >>> > > > > > >>>>> in
> >>> > > > > > >>>>>>>>>> future.
> >>> > > > > > >>>>>>>>>>>> I'm
> >>> > > > > > >>>>>>>>>>>>>> not
> >>> > > > > > >>>>>>>>>>>>>>>>>> familiar with current Catalog
> capabilities,
> >>> > > > > > >>>> and if
> >>> > > > > > >>>>>>> it
> >>> > > > > > >>>>>>>>> could
> >>> > > > > > >>>>>>>>>>>>> extract
> >>> > > > > > >>>>>>>>>>>>>>> and
> >>> > > > > > >>>>>>>>>>>>>>>>>> show some operator-level information from
> >>> > > > > > >>>>> savepoint,
> >>> > > > > > >>>>>>>> that
> >>> > > > > > >>>>>>>>>>> would
> >>> > > > > > >>>>>>>>>>>>> be
> >>> > > > > > >>>>>>>>>>>>>>>> great.
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> If the Catalog can't do that, I would
> >>> > > > > > >> consider
> >>> > > > > > >>>> the
> >>> > > > > > >>>>>>>>> current
> >>> > > > > > >>>>>>>>>>> FLIP
> >>> > > > > > >>>>>>>>>>>>> to
> >>> > > > > > >>>>>>>>>>>>>>> be a
> >>> > > > > > >>>>>>>>>>>>>>>>>> compromise solution.
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> And if we have that unified metadata for
> >>> > > > > > >>>>>>>>>> checkpoint/savepoint
> >>> > > > > > >>>>>>>>>>>> in
> >>> > > > > > >>>>>>>>>>>>>>>> future,
> >>> > > > > > >>>>>>>>>>>>>>>>> we
> >>> > > > > > >>>>>>>>>>>>>>>>>> may directly register savepoint in
> catalog,
> >>> > > > > > >> and
> >>> > > > > > >>>>>>> create
> >>> > > > > > >>>>>>>> a
> >>> > > > > > >>>>>>>>>>> source
> >>> > > > > > >>>>>>>>>>>>>>> without
> >>> > > > > > >>>>>>>>>>>>>>>>>> specifying complex columns, as well as
> >>> > > > > > >> describe
> >>> > > > > > >>>>> the
> >>> > > > > > >>>>>>>>>> savepoint
> >>> > > > > > >>>>>>>>>>>>>> catalog
> >>> > > > > > >>>>>>>>>>>>>>>> to
> >>> > > > > > >>>>>>>>>>>>>>>>>> get the metadata. That's a good solution
> in
> >>> > > > > > >> my
> >>> > > > > > >>>>> mind.
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> Best,
> >>> > > > > > >>>>>>>>>>>>>>>>>> Zakelly
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>> On Wed, Mar 12, 2025 at 10:35 AM Shengkai
> >>> > > > > > >> Fang
> >>> > > > > > >>>> <
> >>> > > > > > >>>>>>>>>>>>> fskm...@gmail.com>
> >>> > > > > > >>>>>>>>>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>> Hi Gabor,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with
> >>> > > > > > >>>>>>> `savepoint-metadata`
> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>> I would argue against introducing a new
> >>> > > > > > >>>>> connector
> >>> > > > > > >>>>>>>> type
> >>> > > > > > >>>>>>>>>>> named
> >>> > > > > > >>>>>>>>>>>>>>>>>>> savepoint-metadata, as the existing
> Catalog
> >>> > > > > > >>>>>>> mechanism
> >>> > > > > > >>>>>>>>> can
> >>> > > > > > >>>>>>>>>>>>>>> inherently
> >>> > > > > > >>>>>>>>>>>>>>>>>>> provide the necessary connector factory
> >>> > > > > > >>>>>>> capabilities.
> >>> > > > > > >>>>>>>>>> I’ve
> >>> > > > > > >>>>>>>>>>>>>> detailed
> >>> > > > > > >>>>>>>>>>>>>>>>> this
> >>> > > > > > >>>>>>>>>>>>>>>>>>> proposal in branch[1]. Please take a
> moment
> >>> > > > > > >>>> to
> >>> > > > > > >>>>>>> review
> >>> > > > > > >>>>>>>>> it.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>> If we introduce a connector named
> >>> > > > > > >>>>>>>> `savepoint-metadata`,
> >>> > > > > > >>>>>>>>>> it
> >>> > > > > > >>>>>>>>>>>>> means
> >>> > > > > > >>>>>>>>>>>>>>> user
> >>> > > > > > >>>>>>>>>>>>>>>>> can
> >>> > > > > > >>>>>>>>>>>>>>>>>>> create a temporary table with connector
> >>> > > > > > >>>>>>>>>>> `savepoint-metadata`
> >>> > > > > > >>>>>>>>>>>>> and
> >>> > > > > > >>>>>>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>>>>>>> connector needs to check whether table
> >>> > > > > > >>>> schema is
> >>> > > > > > >>>>>>> same
> >>> > > > > > >>>>>>>>> to
> >>> > > > > > >>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>> schema
> >>> > > > > > >>>>>>>>>>>>>>>> we
> >>> > > > > > >>>>>>>>>>>>>>>>>>> proposed in the FLIP. On the other hand, it is not
> >>> > > > > > >>>>>>>>>>>>>>>>>>> easy for others to use a metadata table with the
> >>> > > > > > >>>>>>>>>>>>>>>>>>> same schema.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>> [1]
> >>> > > > > > >>>>>>>>>>>>>>>>>>> https://github.com/apache/flink/compare/master...fsk119:flink:state-metadata?expand=1#diff-712a7bc92fe46c405fb0e61b475bb2a005cb7a72bab7df28bbb92744bcb5f465R63
> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>> Best,
> >>> > > > > > >>>>>>>>>>>>>>>>>>> Shengkai
> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on
> >>> > > > > > >>>>>>>>>>>>>>>>>>> Tue, Mar 11, 2025 at 16:56:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Hi Shengkai,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Directionally, I agree with your idea of how it can
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> be implemented.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I mentioned previously that TTL information is not
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> exposed by the state processor API (which the SQL
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> state connector uses to read data), and unless
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> somebody shows me otherwise, this FLIP is not going
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> to address it, to avoid feature creep. Our users are
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> also interested in TTL, so sooner or later we're
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> going to expose it; this is a matter of scheduling.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with
> >>> > > > > > >>>>>>>> `savepoint-metadata`
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I'm not sure I understand your point about
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> StateCatalog. First of all, I can't agree more that
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> StateCatalog is needed, and it is a planned building
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> block in an upcoming FLIP, but I'm not sure how it
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> can help now. No matter what, your knowledge is
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> essential when we add StateCatalog. Let me lay out
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> my understanding of this area:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * First we need create table statements
> >>> > > > > > >> to
> >>> > > > > > >>>>>>> access
> >>> > > > > > >>>>>>>>> state
> >>> > > > > > >>>>>>>>>>>> data
> >>> > > > > > >>>>>>>>>>>>>> and
> >>> > > > > > >>>>>>>>>>>>>>>>>> metadata
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * When we have that then we can add
> >>> > > > > > >>>>> StateCatalog
> >>> > > > > > >>>>>>>>> which
> >>> > > > > > >>>>>>>>>>>> could
> >>> > > > > > >>>>>>>>>>>>>>>>>> potentially
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> ease the life of users by for ex. giving
> >>> > > > > > >>>>>>>>> off-the-shelf
> >>> > > > > > >>>>>>>>>>>> tables
> >>> > > > > > >>>>>>>>>>>>>>>> without
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> sweating with create table statements
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> User expectations:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See state data (this is fulfilled with
> >>> > > > > > >>>> the
> >>> > > > > > >>>>>>>> existing
> >>> > > > > > >>>>>>>>>>>>>> connector)
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See metadata about state data like TTL
> >>> > > > > > >>>> (this
> >>> > > > > > >>>>>>> can
> >>> > > > > > >>>>>>>> be
> >>> > > > > > >>>>>>>>>>> added
> >>> > > > > > >>>>>>>>>>>>> as
> >>> > > > > > >>>>>>>>>>>>>>>>> metadata
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> column as you suggested since it belongs
> >>> > > > > > >> to
> >>> > > > > > >>>>> the
> >>> > > > > > >>>>>>>> data)
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See metadata about operators (this can
> >>> > > > > > >> be
> >>> > > > > > >>>>>>> added
> >>> > > > > > >>>>>>>>> from
> >>> > > > > > >>>>>>>>>>>>>>>>>>> savepoint-metadata)
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> It is important to highlight that the state data table format
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> differs from the state metadata table format: one table has rows
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> for state values and the other has rows for operators, right?
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I think that's the reason why you pointed out that the suggested
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> metadata columns are somewhat clunky.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> In conclusion, I agree to add the ${state-name}_ttl metadata
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> column later on, since it belongs to the state value, and to add
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> a new table type for metadata (as you suggested, similar to
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> PG [1]). Please see how Spark does that too [2].
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> If you have a better approach, please elaborate with more details
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> and help me understand your point.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in TB savepoints that the number of
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> keys can be extremely huge but not the per key state itself.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> But again, this is a good feature as-is and can be handled in a
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> separate jira.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I've just created https://issues.apache.org/jira/browse/FLINK-37456.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> [2] https://www.databricks.com/blog/announcing-state-reader-api-new-statestore-data-source
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> BR,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> G
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> On Tue, Mar 11, 2025 at 3:55 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Hi, Gabor. Thanks for your response.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Thank you for addressing the limitations here. However, I believe
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> it would be beneficial to further clarify the API in this FLIP
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> regarding how users can specify the TTL column.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> One potential approach that comes to mind is using a standardized
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> naming convention such as ${state-name}_ttl for the metadata
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> column that defines the TTL value. In terms of implementation,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> the listReadableMetadata function could:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. Read the table’s columns and configuration,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Extract all defined state names, and
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 3. Return a structured list of metadata entries formatted as
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>    ${state-name}_ttl.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> WDYT?
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
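A minimal sketch of the three steps above, written in Python for brevity rather than against the real Flink connector API. The function names, the `key_column` parameter and the `BIGINT` type string are assumptions made for illustration; the thread fixes nothing beyond the `${state-name}_ttl` naming convention.

```python
from typing import Dict, List

TTL_SUFFIX = "_ttl"  # naming convention proposed in the thread


def extract_state_names(columns: List[str], key_column: str) -> List[str]:
    """Steps 1-2: treat every non-key column as a state name (assumption)."""
    return [name for name in columns if name != key_column]


def list_readable_metadata(state_names: List[str]) -> Dict[str, str]:
    """Step 3: one `${state-name}_ttl` entry per state.

    The value is a placeholder type string. BIGINT is an assumption:
    the thread does not say which type a TTL column would have.
    """
    return {f"{name}{TTL_SUFFIX}": "BIGINT" for name in state_names}


# Hypothetical table with a key column `k` and two state columns.
columns = ["k", "MyValueState", "MyMapState"]
metadata = list_readable_metadata(extract_state_names(columns, key_column="k"))
print(metadata)
```

Deriving the metadata keys from the declared columns keeps the convention mechanical: a user who declares a state column can predict the name of its TTL column without consulting any extra configuration.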
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with `savepoint-metadata`
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Introducing a new connector type at this stage may unnecessarily
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> complicate the system. Given that every table already belongs to
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> a Catalog, which is designed to provide a Factory for building
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> source or sink connectors, I propose integrating a dedicated
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> StateCatalog instead. This approach would allow us to:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. Leverage the Catalog’s existing capabilities to manage TTL
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>    metadata (e.g., state names and TTL logic) without duplicating
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>    functionality.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Provide a unified interface for connector instantiation and
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>    metadata handling through the Catalog’s Factory pattern.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Would this design decision better align with our architecture’s
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> extensibility and reduce redundancy?
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in TB savepoints that the number of
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> keys can be extremely huge but not the per key state itself.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> But again, this is a good feature as-is and can be handled in a
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> separate jira.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> +1 for a separate jira.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Best,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Shengkai
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on Mon, Mar 10, 2025 at 19:05:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Hi Shengkai,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Please see my comments inline.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> BR,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> G
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 3, 2025 at 7:07 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Hi, Gabor. Thanks for the FLIP. I have some questions about the
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> FLIP:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> How can users retrieve the state TTL (Time-to-Live) for each
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> value column?
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> From my understanding of the current design, it seems that this
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> functionality is not supported. Could you clarify if there are
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> plans to address this limitation?
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Since the state processor API does not yet expose this
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> information, this would require several steps.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> First, support needs to be added to the state processor API,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> which can then be exposed on the SQL API.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This is definitely a useful future improvement and can be
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> handled in a separate jira.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> The metadata information described in the FLIP appears to be
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> intended to describe the state files stored at a specific
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> location. To me, this concept aligns more closely with system
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> tables like pg_tables in PostgreSQL [1] or the
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> INFORMATION_SCHEMA in MySQL [2].
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Adding a new connector with `savepoint-metadata` is a
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> possibility where we can create such functionality.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> I'm not against that, I just want to have a common agreement
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> that we would like to move in that direction.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> (As a side note, not just PG but Spark also has a similar
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> approach, and I basically like the idea.)
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> If we go in that direction, savepoint metadata can be exposed
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> in such a way that one row represents an operator with its
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> values, something like this:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ┌─────────────────────────┬─────────────────────────────┬────────────────────────────────┬───────────┬──────────────┬──────────────────┬───────────────────────────┬──────────────────────┐
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │operatorName             │operatorUid                  │operatorHash                    │parallelism│maxParallelism│subtaskStatesCount│coordinatorStateSizeInBytes│totalStatesSizeInBytes│
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │Source: datagen-source   │datagen-source-uid           │47aee94394d6ea26e2d544bef0a37bb5│2          │128           │2                 │16                         │546                   │
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │long-udf-with-master-hook│long-udf-with-master-hook-uid│6ed3f40bff3c8dfcdfcb95128a1018f1│2          │128           │2                 │0                          │0                     │
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │value-process            │value-process-uid            │ca4f5fe9a637b656f09ea78b3e7a15b9│2          │128           │2                 │0                          │40726                 │
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This table can then be joined with the tables created by the
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> already existing `savepoint` connector, based on the UID hash
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> (which is unique and always exists).
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This would mean that the already existing table would need only
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> a single metadata column, which is the UID hash.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> WDYT?
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> @zakelly, please share your thoughts too.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
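A rough sketch of the join described above, in plain Python instead of Flink SQL. The operator rows reuse values from the example table; the state rows, the column names `k` and `MyValueState`, and the helper `join_on_hash` are hypothetical and exist only to show that the UID hash is enough as a join key.

```python
from typing import Dict, Iterable, Iterator

# Operator rows, with values taken from the example table above.
operators = [
    {"operatorUid": "datagen-source-uid",
     "operatorHash": "47aee94394d6ea26e2d544bef0a37bb5",
     "totalStatesSizeInBytes": 546},
    {"operatorUid": "value-process-uid",
     "operatorHash": "ca4f5fe9a637b656f09ea78b3e7a15b9",
     "totalStatesSizeInBytes": 40726},
]

# Hypothetical state rows: the UID hash is their only metadata column.
state_rows = [
    {"operatorHash": "ca4f5fe9a637b656f09ea78b3e7a15b9", "k": 1, "MyValueState": 10},
    {"operatorHash": "ca4f5fe9a637b656f09ea78b3e7a15b9", "k": 2, "MyValueState": 20},
]


def join_on_hash(states: Iterable[Dict], ops: Iterable[Dict]) -> Iterator[Dict]:
    """Inner join: enrich each state row with its operator's metadata."""
    by_hash = {op["operatorHash"]: op for op in ops}
    for row in states:
        op = by_hash.get(row["operatorHash"])
        if op is not None:
            yield {**op, **row}


joined = list(join_on_hash(state_rows, operators))
for row in joined:
    print(row["operatorUid"], row["k"], row["MyValueState"])
```

Because operator-level values live in one row per operator, they are not repeated on every state record; the state table only carries the hash.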
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> If we opt to use metadata columns, every record in the table
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> would end up having identical values for these columns (please
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> correct me if I’m mistaken). On the other hand, the state
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> connector requires users to specify an operator UID or operator
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> UID hash, after which it outputs user-defined values in its
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> records. This approach feels somewhat redundant to me.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> If we add a new `savepoint-metadata` connector, then this can be
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> addressed.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> On the other hand, UID and UID hash have an either-or
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> relationship from a config perspective, so when a user provides
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> the UID, he/she may still be interested in the hash for further
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> calculations (the whole of Flink's internals depends on the
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> hash). Printing out the human-readable UID is an explicit
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> requirement from the user side, because hashes are not human
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> readable.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 3. Handling LIST and MAP States in the State Connector
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> I have concerns about how the current design handles LIST and
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> MAP states. Specifically, the state connector uses Flink SQL’s
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> MAP and ARRAY types, which implies that it attempts to load
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> entire MAP or LIST states into memory.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> However, in many real-world scenarios, these states can grow
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> very large. Typically, the state API addresses this by
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> providing an iterator to traverse elements within the state
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> incrementally. I’m unsure whether I’ve missed something in
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> FLIP-496 or FLIP-512, but it seems that the current design
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> might struggle with scalability in such cases.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> You see it right: the current implementation keeps the state
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> for a single key in memory.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Back in the day we considered this potential issue and
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> concluded that this is not necessarily needed for the initial
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> version and can be done as a later improvement.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in TB
> >>> > > > > > >>>>>>> savepoints
> >>> > > > > > >>>>>>>>> that
> >>> > > > > > >>>>>>>>>>> the
> >>> > > > > > >>>>>>>>>>>>>> number
> >>> > > > > > >>>>>>>>>>>>>>>> of
> >>> > > > > > >>>>>>>>>>>>>>>>>> keys
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> can
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> be extremely huge but not the per key
> >>> > > > > > >>>>> state
> >>> > > > > > >>>>>>>>> itself.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> But again, this is a good feature
> >>> > > > > > >> as-is
> >>> > > > > > >>>>> and
> >>> > > > > > >>>>>>> can
> >>> > > > > > >>>>>>>>> be
> >>> > > > > > >>>>>>>>>>>>> handled
> >>> > > > > > >>>>>>>>>>>>>>> in a
> >>> > > > > > >>>>>>>>>>>>>>>>>>>> separate
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> jira.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Best,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Shengkai
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> [2] https://dev.mysql.com/doc/refman/8.4/en/information-schema-tables-table.html
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 3, 2025 at 2:00 AM
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> Hi Zakelly,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> To keep things simple, the target is
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> `METADATA VIRTUAL` as the keywords
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> for definition.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> If it's not too complex, the latter
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> (`METADATA FROM xxx VIRTUAL`) can be
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> added too.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> BR,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> G
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Mar 2, 2025 at 3:37 PM
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> Zakelly Lan <zakelly....@gmail.com>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Hi Gabor,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> +1 for this.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Will the metadata column use
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> `METADATA VIRTUAL` as keywords for
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> definition, or
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> `METADATA FROM xxx VIRTUAL` for
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> renaming, just like the Kafka table?
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Zakelly
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Mar 1, 2025 at 1:31 PM
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Hi All,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> I'd like to start a discussion of
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> FLIP-512: Add meta information to
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> SQL state connector [1].
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Feel free to add your thoughts to
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> make this feature better.
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-512%3A+Add+meta+information+to+SQL+state+connector
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> BR,
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> G
> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
