Just to avoid jumping from one thread to another, here's Timo's suggestion:

"if I understand the discussion correctly, you want to use a PTF without
table arguments to return a table (read from savepoint metadata)? If
this is the case, you don't need a PTF for it. A regular table function
can also do the job. IIRC we support TVF with constant args."

I've tried the TVF out and it works in batch mode like a charm.
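
For anyone following along, here is a rough sketch of the shape such a table function with a constant argument could take. It is only an illustration and not the FLIP's implementation: the class name, the column names and the placeholder rows are all assumptions. In Flink the class would extend org.apache.flink.table.functions.TableFunction and emit rows through collect(...); a stand-in collector is used here so the snippet has no dependencies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Flink table function; NOT the FLIP's code.
class ReadStateMetadataSketch {

    // One output row. The column names are illustrative assumptions.
    static final class MetadataRow {
        final String operatorUid;
        final long stateSizeBytes;

        MetadataRow(String operatorUid, long stateSizeBytes) {
            this.operatorUid = operatorUid;
            this.stateSizeBytes = stateSizeBytes;
        }
    }

    private final List<MetadataRow> out = new ArrayList<>();

    // Stand-in for TableFunction#collect(...) in Flink.
    private void collect(MetadataRow row) {
        out.add(row);
    }

    // Invoked once with a constant argument (e.g. a savepoint path);
    // there is no table argument, which is why a PTF is not required.
    public void eval(String savepointPath) {
        // A real implementation would read the savepoint metadata found at
        // savepointPath. Placeholder rows are emitted only to show the flow.
        collect(new MetadataRow("source-uid", 1024L));
        collect(new MetadataRow("window-uid", 4096L));
    }

    public List<MetadataRow> rows() {
        return out;
    }
}
```

The point of the sketch is only that the function takes scalar constants and produces a small table, which can then be queried or joined like any other table.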

> I'd also suggest we make it built-in without registration.
I basically agree with having this built in.

From my perspective there are no further questions/concerns, and I've updated
the FLIP accordingly.
If somebody has any, please share; otherwise I would like to go on with the
vote.

All in all, thanks to everybody for the constructive suggestions; we've
made things much better.

BR,
G


On Fri, Mar 28, 2025 at 9:19 AM Zakelly Lan <zakelly....@gmail.com> wrote:

> Hi all,
>
> Given the simplicity, I also +1 for PTF or any other function
> implementation if PTF is not applicable for this.
>
> > I would like to raise a consideration regarding the usage implementation:
> > Would it be necessary to allow users to utilize the CREATE FUNCTION
> > statement for registering the PTF?
>
>
> I'd also suggest we make it built-in without registration.
>
> > Currently, Flink SQL supports letting external systems register modules and
> > leverage these modules to centrally manage all function definitions. Given
> > this architectural approach, I’m curious if the plan involves introducing
> > additional functions in the future. If so, I would advocate for introducing
> > a dedicated state module to centralize such management. This would empower
> > users to:
>
>
> I can’t think of any further functions for now, but I'd +1 for a module if
> it could omit the registration.
>
>
> Best,
> Zakelly.
>
>
>
> On Fri, Mar 28, 2025 at 10:25 AM Shengkai Fang <fskm...@gmail.com> wrote:
>
> > One more question about the FLIP.
> >
> > I think the output schema is definitely a public API to users. If users
> > use the `CREATE FUNCTION` statement, does that mean the class path is also
> > a public API to users? Alternatively, this is merely an experimental
> > feature and we make no promises about this function.
> >
> > Best,
> > Shengkai
> >
> > On Fri, Mar 28, 2025 at 10:20 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >
> >> +1 to use PTF.
> >>
> >> I would like to raise a consideration regarding the usage implementation:
> >> Would it be necessary to allow users to utilize the CREATE FUNCTION
> >> statement for registering the PTF?
> >>
> >> Currently, Flink SQL supports letting external systems register modules
> >> and leverage these modules to centrally manage all function definitions.
> >> Given this architectural approach, I’m curious if the plan involves
> >> introducing additional functions in the future. If so, I would advocate
> >> for introducing a dedicated state module to centralize such management.
> >> This would empower users to:
> >>
> >> 1. Simply execute the LOAD MODULE command to load the required module, and
> >> 2. Directly invoke read_metadata thereafter.
> >>
> >> For more details about the module, please refer to this document[1].
> >>
> >> Best,
> >> Shengkai
> >>
> >> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/modules/
> >>
> >> On Fri, Mar 28, 2025 at 12:26 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>
> >>> Just found out that PTF in batch mode is not supported, please see the
> >>> dev mailing list thread about it [1].
> >>>
> >>> [1] https://lists.apache.org/thread/ytm9m1qt4pq2q2gjngfktrn8vrlvkf07
> >>>
> >>> BR,
> >>> G
> >>>
> >>>
> >>> On Thu, Mar 27, 2025 at 3:38 PM Gabor Somogyi <
> gabor.g.somo...@gmail.com
> >>> >
> >>> wrote:
> >>>
> >>> > In the meantime I've just updated the FLIP according to this to be
> >>> > optimistic 🙂
> >>> >
> >>> > BR,
> >>> > G
> >>> >
> >>> > On Thu, Mar 27, 2025 at 2:15 PM Gabor Somogyi <
> >>> gabor.g.somo...@gmail.com>
> >>> > wrote:
> >>> >
> >>> >> Considering all the facts I also +1 on PTF. Even if something is
> >>> missing
> >>> >> we can add later.
> >>> >>
> >>> >> @Zakelly Lan <zakelly....@gmail.com> @Shengkai Fang are you also on
> >>> the
> >>> >> same page or have something to add?
> >>> >>
> >>> >> BR,
> >>> >> G
> >>> >>
> >>> >>
> >>> >> On Thu, Mar 27, 2025 at 1:50 PM Lincoln Lee <lincoln.8...@gmail.com
> >
> >>> >> wrote:
> >>> >>
> >>> >>> +1 for PTF
> >>> >>>
> >>> >>> > Is it possible to describe such function to see the column
> >>> names/types?
> >>> >>>
> >>> >>> Although Flink SQL does not directly support this feature, users can
> >>> >>> achieve similar results with the help of the `explain` syntax, e.g.
> >>> >>> 'explain select * from read_state_metadata(...)'
> >>> >>>
> >>> >>>
> >>> >>> Best,
> >>> >>> Lincoln Lee
> >>> >>>
> >>> >>>
> >>> >>> On Thu, Mar 27, 2025 at 8:41 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> >>>
> >>> >>> > Hey!
> >>> >>> >
> >>> >>> > I think the PTF approach strikes a great balance between simplicity
> >>> >>> > and the capabilities that we get out of it.
> >>> >>> >
> >>> >>> > I think this could be a completely viable alternative to the
> >>> >>> > dedicated connector, +1.
> >>> >>> >
> >>> >>> > Cheers,
> >>> >>> > Gyula
> >>> >>> >
> >>> >>> > On Thu, Mar 27, 2025 at 10:37 AM Shengkai Fang <
> fskm...@gmail.com>
> >>> >>> wrote:
> >>> >>> >
> >>> >>> > > Hi, Gabor.
> >>> >>> > >
> >>> >>> > > > Do I understand correctly that this is 2.x only feature and
> we
> >>> >>> can't
> >>> >>> > > backport it to 1.x line
> >>> >>> > >
> >>> >>> > > Yes. PTF is only supported in the 2.x versions.
> >>> >>> > >
> >>> >>> > > > Is it possible to describe such function to see the column
> >>> >>> names/types?
> >>> >>> > >
> >>> >>> > > Flink SQL doesn't support this feature, but PostgreSQL[2] and
> >>> >>> > > MySQL[1] have a similar feature.
> >>> >>> > >
> >>> >>> > > [1] https://dev.mysql.com/doc/refman/8.4/en/show-create-procedure.html
> >>> >>> > > [2] https://stackoverflow.com/questions/6898453/show-the-code-of-a-function-procedure-and-trigger-in-postgresql
> >>> >>> > >
> >>> >>> > > Best,
> >>> >>> > > Shengkai
> >>> >>> > >
> >>> >>> > >
> >>> >>> > > On Thu, Mar 27, 2025 at 4:25 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > >
> >>> >>> > > > Hi Shengkai,
> >>> >>> > > >
> >>> >>> > > > Thanks for your effort with the example, this looks
> promising.
> >>> >>> > > > I like the fact that users wouldn't need to sweat with
> complex
> >>> >>> create
> >>> >>> > > table
> >>> >>> > > > statements.
> >>> >>> > > >
> >>> >>> > > > Couple of questions:
> >>> >>> > > > * Do I understand correctly that this is a 2.x-only feature and
> >>> >>> > > > we can't backport it to the 1.x line?
> >>> >>> > > > I don't intend to do any backport, I would just like to know the
> >>> >>> > > > technical constraints.
> >>> >>> > > > * Is it possible to describe such a function to see the column
> >>> >>> > > > names/types?
> >>> >>> > > >
> >>> >>> > > > BR,
> >>> >>> > > > G
> >>> >>> > > >
> >>> >>> > > >
> >>> >>> > > > On Thu, Mar 27, 2025 at 3:17 AM Shengkai Fang <
> >>> fskm...@gmail.com>
> >>> >>> > wrote:
> >>> >>> > > >
> >>> >>> > > > > Many thanks for your reminder, Leonard. Here's the link I
> >>> >>> > mentioned[1].
> >>> >>> > > > >
> >>> >>> > > > > Best,
> >>> >>> > > > > Shengkai
> >>> >>> > > > >
> >>> >>> > > > > [1] https://github.com/apache/flink/pull/26358
> >>> >>> > > > >
> >>> >>> > > > > On Thu, Mar 27, 2025 at 10:05 AM Leonard Xu <xbjt...@gmail.com> wrote:
> >>> >>> > > > >
> >>> >>> > > > > > Your link is broken, Shengkai
> >>> >>> > > > > >
> >>> >>> > > > > > Best,
> >>> >>> > > > > > Leonard
> >>> >>> > > > > >
> >>> >>> > > > > > > On Mar 27, 2025, at 10:01 AM, Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >>> > > > > > >
> >>> >>> > > > > > > Hi, All.
> >>> >>> > > > > > >
> >>> >>> > > > > > > I wrote a simple demo to illustrate my idea. Hope this helps.
> >>> >>> > > > > > >
> >>> >>> > > > > > > Best,
> >>> >>> > > > > > > Shengkai
> >>> >>> > > > > > >
> >>> >>> > > > > > > https://github.com/apache/flink/compare/master...fsk119:flink:example?expand=1
> >>> >>> > > > > > >
> >>> >>> > > > > > > On Wed, Mar 26, 2025 at 3:54 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > > > > > >
> >>> >>> > > > > > >>> I'm fine with a separate SQL connector for metadata, so
> >>> >>> > > > > > >>> maybe we could update the FLIP about our discussion?
> >>> >>> > > > > > >>
> >>> >>> > > > > > >> Sorry, I've forgotten this part. Yeah, no matter which
> >>> >>> > > > > > >> option we choose, I'm going to update the FLIP.
> >>> >>> > > > > > >>
> >>> >>> > > > > > >> G
> >>> >>> > > > > > >>
> >>> >>> > > > > > >>
> >>> >>> > > > > > >> On Wed, Mar 26, 2025 at 8:51 AM Gabor Somogyi <
> >>> >>> > > > > > gabor.g.somo...@gmail.com>
> >>> >>> > > > > > >> wrote:
> >>> >>> > > > > > >>
> >>> >>> > > > > > >>> Hi All,
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> I also lack knowledge of PTFs, so I've only read the
> >>> >>> > > > > > >>> motivation part:
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> "The SQL 2016 standard introduced a way of defining
> >>> custom
> >>> >>> SQL
> >>> >>> > > > > > operators
> >>> >>> > > > > > >>> defined by ISO/IEC 19075-7:2021 (Part 7: Polymorphic
> >>> table
> >>> >>> > > > > functions).
> >>> >>> > > > > > >>> ~200 pages define how this new kind of function can
> >>> >>> consume and
> >>> >>> > > > > produce
> >>> >>> > > > > > >>> tables with various execution properties.
> >>> >>> > > > > > >>> Unfortunately, this part of the standard is not
> >>> publicly
> >>> >>> > > > available."
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> Of course we can take a look at some examples but do
> we
> >>> >>> really
> >>> >>> > > want
> >>> >>> > > > > to
> >>> >>> > > > > > >>> expose state data with this construct
> >>> >>> > > > > > >>> which is described in ~200 pages and part of the
> >>> standard
> >>> >>> is
> >>> >>> > not
> >>> >>> > > > > > publicly
> >>> >>> > > > > > >>> available? 🙂
> >>> >>> > > > > > >>> I mean the dataset is a couple of rows, and the use-case
> >>> >>> > > > > > >>> is a join with another table, such as the state data.
> >>> >>> > > > > > >>> If somebody can point out advantages I would buy that, but
> >>> >>> > > > > > >>> from my limited understanding this would be overkill here.
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> BR,
> >>> >>> > > > > > >>> G
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> On Wed, Mar 26, 2025 at 8:28 AM Gyula Fóra <
> >>> >>> > gyula.f...@gmail.com
> >>> >>> > > >
> >>> >>> > > > > > wrote:
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>>> Hi Zakelly , Shengkai!
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>> I don't know too much about PTFs, it would be
> >>> interesting
> >>> >>> to
> >>> >>> > see
> >>> >>> > > > how
> >>> >>> > > > > > the
> >>> >>> > > > > > >>>> usage would look in practice.
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>> Do you have some mockup/example in mind how the PTF
> >>> would
> >>> >>> look
> >>> >>> > > for
> >>> >>> > > > > > >> example
> >>> >>> > > > > > >>>> when want to:
> >>> >>> > > > > > >>>> - Simply display/aggregate whats in the metadata
> >>> >>> > > > > > >>>> - Join keyed state with some metadata columns
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>> Thanks
> >>> >>> > > > > > >>>> Gyula
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>> On Wed, Mar 26, 2025 at 7:33 AM Zakelly Lan <
> >>> >>> > > > zakelly....@gmail.com>
> >>> >>> > > > > > >>>> wrote:
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>>> Hi everyone,
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>> I'm fine with a separate SQL connector for metadata, so
> >>> >>> > > > > > >>>>> maybe we could update the FLIP about our discussion? And
> >>> >>> > > > > > >>>>> Shengkai provides a PTF implementation; does that also
> >>> >>> > > > > > >>>>> meet the requirement?
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>> Best,
> >>> >>> > > > > > >>>>> Zakelly
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>> On Thu, Mar 20, 2025 at 4:47 PM Gabor Somogyi <
> >>> >>> > > > > > >>>> gabor.g.somo...@gmail.com>
> >>> >>> > > > > > >>>>> wrote:
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>>> Hi All,
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>> @Zakelly: Gyula summarised it correctly what I
> >>> meant so
> >>> >>> > please
> >>> >>> > > > > treat
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>> content as mine.
> >>> >>> > > > > > >>>>>> As an addition I'm not against to add CLI at all,
> >>> I'm
> >>> >>> just
> >>> >>> > > > stating
> >>> >>> > > > > > >>>> that
> >>> >>> > > > > > >>>>> in
> >>> >>> > > > > > >>>>>> some cases like this, users would like to have
> >>> >>> > > > > > >>>>>> a self-serving solution where they can provide SQL
> >>> >>> > statements
> >>> >>> > > > > which
> >>> >>> > > > > > >>>> can
> >>> >>> > > > > > >>>>>> trigger alerts automatically.
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>> My personal opinion is that CLI would be
> beneficial
> >>> for
> >>> >>> > > several
> >>> >>> > > > > > >>>> cases. A
> >>> >>> > > > > > >>>>>> good example is when users want to restart job
> >>> >>> > > > > > >>>>>> from specific Kafka offsets which are persisted
> in a
> >>> >>> > > savepoint.
> >>> >>> > > > > For
> >>> >>> > > > > > >>>> such
> >>> >>> > > > > > >>>>>> scenario users are more than happy since they
> >>> >>> > > > > > >>>>>> expect manual intervention with full control. So
> >>> all in
> >>> >>> all
> >>> >>> > > one
> >>> >>> > > > > can
> >>> >>> > > > > > >>>> count
> >>> >>> > > > > > >>>>>> on my +1 when CLI FLIP would come up...
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>> BR,
> >>> >>> > > > > > >>>>>> G
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>> On Thu, Mar 20, 2025 at 8:20 AM Gyula Fóra <
> >>> >>> > > > gyula.f...@gmail.com>
> >>> >>> > > > > > >>>> wrote:
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>>> Hi!
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> @Zakelly Lan <zakelly....@gmail.com>
> >>> >>> > > > > > >>>>>>> I think what Gabor means is that users want to
> have
> >>> >>> > > predefined
> >>> >>> > > > > SQL
> >>> >>> > > > > > >>>>> scripts
> >>> >>> > > > > > >>>>>>> to perform state analysis tasks to debug/identify
> >>> >>> problems.
> >>> >>> > > > > > >>>>>>> Such as write a SQL script that joins the
> metadata
> >>> >>> table
> >>> >>> > with
> >>> >>> > > > the
> >>> >>> > > > > > >>>> state
> >>> >>> > > > > > >>>>>>> and
> >>> >>> > > > > > >>>>>>> do some analytics on it.
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> If we have a meta table then the SQL script that
> >>> can do
> >>> >>> > this
> >>> >>> > > is
> >>> >>> > > > > > >> fixed
> >>> >>> > > > > > >>>>> and
> >>> >>> > > > > > >>>>>>> users can trigger this on demand by simply
> >>> providing a
> >>> >>> new
> >>> >>> > > > > > >> savepoint
> >>> >>> > > > > > >>>>> path.
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> If we have a different mechanism to extract
> >>> metadata
> >>> >>> that
> >>> >>> > is
> >>> >>> > > > not
> >>> >>> > > > > > >> SQL
> >>> >>> > > > > > >>>>>>> native
> >>> >>> > > > > > >>>>>>> then manual steps need to be executed and a
> custom
> >>> SQL
> >>> >>> > script
> >>> >>> > > > > would
> >>> >>> > > > > > >>>> need
> >>> >>> > > > > > >>>>>>> to
> >>> >>> > > > > > >>>>>>> be written that adds the manually extracted
> >>> metadata
> >>> >>> into
> >>> >>> > the
> >>> >>> > > > > > >> script.
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> Cheers,
> >>> >>> > > > > > >>>>>>> Gyula
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> On Thu, Mar 20, 2025 at 4:32 AM Zakelly Lan <
> >>> >>> > > > > zakelly....@gmail.com
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>>>>>> wrote:
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>>> Hi all,
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> Thanks for your answers! Getting everyone
> aligned
> >>> on
> >>> >>> this
> >>> >>> > > > topic
> >>> >>> > > > > > >> is
> >>> >>> > > > > > >>>>>>>> challenging, but it’s definitely worth the
> effort
> >>> >>> since it
> >>> >>> > > > will
> >>> >>> > > > > > >>>> help
> >>> >>> > > > > > >>>>>>>> streamline things moving forward.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> @Gabor are you saying that users are using some
> >>> >>> scripts to
> >>> >>> > > > > define
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>> SQL
> >>> >>> > > > > > >>>>>>>> metadata connector and get the information,
> >>> right? If
> >>> >>> so,
> >>> >>> > > > would
> >>> >>> > > > > a
> >>> >>> > > > > > >>>> CLI
> >>> >>> > > > > > >>>>>>> tool
> >>> >>> > > > > > >>>>>>>> be more convenient? It's easy to invoke and can
> >>> get
> >>> >>> the
> >>> >>> > > result
> >>> >>> > > > > > >>>>> swiftly.
> >>> >>> > > > > > >>>>>>> And
> >>> >>> > > > > > >>>>>>>> there should be some other systems to track the
> >>> >>> checkpoint
> >>> >>> > > > > > >> lineage
> >>> >>> > > > > > >>>> and
> >>> >>> > > > > > >>>>>>>> analyze if there are outliers in metadata (e.g.
> >>> state
> >>> >>> size
> >>> >>> > > of
> >>> >>> > > > > one
> >>> >>> > > > > > >>>>>>> operator)
> >>> >>> > > > > > >>>>>>>> right? Well, maybe I missed something so please
> >>> >>> correct me
> >>> >>> > > if
> >>> >>> > > > > I'm
> >>> >>> > > > > > >>>>> wrong.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> I think the overall vision in Flink SQL is to
> >>> provide
> >>> >>> a
> >>> >>> > SQL
> >>> >>> > > > > > >> native
> >>> >>> > > > > > >>>>>>>>> environment where we can serve complex
> use-cases
> >>> >>> like you
> >>> >>> > > > would
> >>> >>> > > > > > >>>>> expect
> >>> >>> > > > > > >>>>>>>> in a
> >>> >>> > > > > > >>>>>>>>> regular database.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> @Gyula Well, this is a good point. From the
> >>> >>> perspective of
> >>> >>> > > > > > >>>>> comprehensive
> >>> >>> > > > > > >>>>>>>> SQL experience, I'd +1 for treating metadata as
> >>> data.
> >>> >>> > > > Although I
> >>> >>> > > > > > >>>> doubt
> >>> >>> > > > > > >>>>>>> if
> >>> >>> > > > > > >>>>>>>> there is a need for processing metadata, I won't
> >>> be
> >>> >>> > against
> >>> >>> > > a
> >>> >>> > > > > > >>>> separate
> >>> >>> > > > > > >>>>>>>> connector.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> Regarding the CLI tool, I still think it’s worth
> >>> >>> > > implementing.
> >>> >>> > > > > > >>>> Such a
> >>> >>> > > > > > >>>>>>> tool
> >>> >>> > > > > > >>>>>>>> could provide savepoint information before
> >>> resuming
> >>> >>> from a
> >>> >>> > > > > > >>>> savepoint,
> >>> >>> > > > > > >>>>>>> which
> >>> >>> > > > > > >>>>>>>> would enhance the user experience in CLI-based
> >>> >>> workflows.
> >>> >>> > It
> >>> >>> > > > > > >> would
> >>> >>> > > > > > >>>> be
> >>> >>> > > > > > >>>>>>> good
> >>> >>> > > > > > >>>>>>>> if someone could implement this feature. We
> >>> shouldn’t
> >>> >>> > worry
> >>> >>> > > > > about
> >>> >>> > > > > > >>>>>>> whether
> >>> >>> > > > > > >>>>>>>> this tool might be retired in the future.
> >>> Regardless
> >>> >>> of
> >>> >>> > the
> >>> >>> > > > > > >>>> SQL-based
> >>> >>> > > > > > >>>>>>>> solution we eventually adopt, this capability
> will
> >>> >>> remain
> >>> >>> > > > > > >> essential
> >>> >>> > > > > > >>>>> for
> >>> >>> > > > > > >>>>>>> CLI
> >>> >>> > > > > > >>>>>>>> users. This is another topic.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>> Zakelly
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> On Thu, Mar 20, 2025 at 10:37 AM Shengkai Fang <
> >>> >>> > > > > > >> fskm...@gmail.com>
> >>> >>> > > > > > >>>>>>> wrote:
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>>> Hi.
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> After reading the doc[1], I think Spark provides a
> >>> >>> > > > > > >>>>>>>>> function for users to consume the metadata from the
> >>> >>> > > > > > >>>>>>>>> savepoint. In Flink SQL, similar functionality is
> >>> >>> > > > > > >>>>>>>>> implemented through Polymorphic Table Functions (PTF)
> >>> >>> > > > > > >>>>>>>>> as proposed in FLIP-440[2]. Below is a code example[3]
> >>> >>> > > > > > >>>>>>>>> illustrating this concept:
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> ```
> >>> >>> > > > > > >>>>>>>>>    public static class ScalarArgsFunction extends TestProcessTableFunctionBase {
> >>> >>> > > > > > >>>>>>>>>        public void eval(Integer i, Boolean b) {
> >>> >>> > > > > > >>>>>>>>>            collectObjects(i, b);
> >>> >>> > > > > > >>>>>>>>>        }
> >>> >>> > > > > > >>>>>>>>>    }
> >>> >>> > > > > > >>>>>>>>> ```
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> ```
> >>> >>> > > > > > >>>>>>>>> INSERT INTO sink SELECT * FROM f(i => 42, b => CAST('TRUE' AS BOOLEAN))
> >>> >>> > > > > > >>>>>>>>> ```
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> So we can add a builtin function named
> >>> >>> > > > > > >>>>>>>>> `read_state_metadata` to read savepoint data.
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>> Shengkai
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> [1] https://docs.databricks.com/aws/en/structured-streaming/read-state?language=SQL
> >>> >>> > > > > > >>>>>>>>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093
> >>> >>> > > > > > >>>>>>>>> [3] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/stream/ProcessTableFunctionTestPrograms.java#L140
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> On Wed, Mar 19, 2025 at 6:37 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> Hi All!
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> Thank you for the answers and concerns from
> >>> >>> everyone.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> On the CLI vs State Metadata Connector/Table
> >>> >>> question I
> >>> >>> > > > would
> >>> >>> > > > > > >>>> also
> >>> >>> > > > > > >>>>>>> like
> >>> >>> > > > > > >>>>>>>>> to
> >>> >>> > > > > > >>>>>>>>>> step back a little and look at the bigger
> >>> picture.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> I think the overall vision in Flink SQL is to
> >>> >>> provide a
> >>> >>> > > SQL
> >>> >>> > > > > > >>>> native
> >>> >>> > > > > > >>>>>>>>>> environment where we can serve complex
> use-cases
> >>> >>> like
> >>> >>> > you
> >>> >>> > > > > > >> would
> >>> >>> > > > > > >>>>>>> expect
> >>> >>> > > > > > >>>>>>>>> in a
> >>> >>> > > > > > >>>>>>>>>> regular database.
> >>> >>> > > > > > >>>>>>>>>> Most features, developments in the recent
> years
> >>> have
> >>> >>> > gone
> >>> >>> > > > > > >> this
> >>> >>> > > > > > >>>>> way.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> The State Metadata Table would be a natural
> and
> >>> >>> > > > > > >> straightforward
> >>> >>> > > > > > >>>>> fit
> >>> >>> > > > > > >>>>>>>> here.
> >>> >>> > > > > > >>>>>>>>>> So from my side, +1 for that.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> However I could understand if we are not ready
> >>> to
> >>> >>> add a
> >>> >>> > > new
> >>> >>> > > > > > >>>>>>>>>> connector/format due to maintenance concerns
> >>> (and in
> >>> >>> > > general
> >>> >>> > > > > > >>>>> concern
> >>> >>> > > > > > >>>>>>>>> about
> >>> >>> > > > > > >>>>>>>>>> the design).
> >>> >>> > > > > > >>>>>>>>>> If that's the issue then we should spend more
> >>> time
> >>> >>> on
> >>> >>> > the
> >>> >>> > > > > > >>>> design
> >>> >>> > > > > > >>>>> to
> >>> >>> > > > > > >>>>>>> get
> >>> >>> > > > > > >>>>>>>>>> comfortable with the approach and seek
> feedback
> >>> >>> from the
> >>> >>> > > > > > >> wider
> >>> >>> > > > > > >>>>>>>> community
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> I am -1 for the CLI/tooling approach as that
> >>> will
> >>> >>> not
> >>> >>> > > > provide
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>>>>> featureset we are looking for that is not
> >>> already
> >>> >>> > covered
> >>> >>> > > by
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>> Java
> >>> >>> > > > > > >>>>>>>>>> connector. And that approach would come with
> the
> >>> >>> same
> >>> >>> > > > > > >>>> maintenance
> >>> >>> > > > > > >>>>>>>>>> implications.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> Cheers
> >>> >>> > > > > > >>>>>>>>>> Gyula
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> On Wed, Mar 19, 2025 at 11:24 AM Gabor
> Somogyi <
> >>> >>> > > > > > >>>>>>>>> gabor.g.somo...@gmail.com>
> >>> >>> > > > > > >>>>>>>>>> wrote:
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> Hi Zakelly, Shengkai
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> Several topics are going on so adding gist
> >>> answers
> >>> >>> to
> >>> >>> > > them.
> >>> >>> > > > > > >>>> When
> >>> >>> > > > > > >>>>>>> some
> >>> >>> > > > > > >>>>>>>>>> topic
> >>> >>> > > > > > >>>>>>>>>>> is not touched please highlight it.
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> @Shengkai: I've read through all the previous
> >>> FLIPs
> >>> >>> > > related
> >>> >>> > > > > > >>>>>>> catalogs
> >>> >>> > > > > > >>>>>>>>> and
> >>> >>> > > > > > >>>>>>>>>> if
> >>> >>> > > > > > >>>>>>>>>>> we would like to keep the concepts there
> >>> >>> > > > > > >>>>>>>>>>> then one-to-one mapping relationship between
> >>> >>> savepoint
> >>> >>> > > and
> >>> >>> > > > > > >>>>> catalog
> >>> >>> > > > > > >>>>>>>> is a
> >>> >>> > > > > > >>>>>>>>>>> reasonable direction. In short I'm happy that
> >>> >>> > > > > > >>>>>>>>>>> you've highlighted this and agree as a whole.
> >>> I've
> >>> >>> > > written
> >>> >>> > > > > > >> it
> >>> >>> > > > > > >>>>> down
> >>> >>> > > > > > >>>>>>>>>>> previously, just want to double confirm that
> >>> state
> >>> >>> > > catalog
> >>> >>> > > > > > >> is
> >>> >>> > > > > > >>>>>>>>>>> essential and planned. When we reach this
> point
> >>> >>> then
> >>> >>> > your
> >>> >>> > > > > > >>>> input
> >>> >>> > > > > > >>>>> is
> >>> >>> > > > > > >>>>>>>> more
> >>> >>> > > > > > >>>>>>>>>>> than welcome.
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> @Zakelly: We've already tried the CLI and separate
> >>> >>> > > > > > >>>>>>>>>>> library approaches with users, and they are not
> >>> >>> > > > > > >>>>>>>>>>> welcome because of the following:
> >>> >>> > > > > > >>>>>>>>>>> * Users want to have automated tasks and not manual
> >>> >>> > > > > > >>>>>>>>>>> CLI/library output parsing. This can be hacked around,
> >>> >>> > > > > > >>>>>>>>>>> but our experience with this is negative because it's
> >>> >>> > > > > > >>>>>>>>>>> just brittle.
> >>> >>> > > > > > >>>>>>>>>>> * From a development perspective it's a much bigger
> >>> >>> > > > > > >>>>>>>>>>> effort than a connector (hard to test,
> >>> >>> > > > > > >>>>>>>>>>> packaging/version handling is an extra layer of
> >>> >>> > > > > > >>>>>>>>>>> complexity, external FS authentication is a pain for
> >>> >>> > > > > > >>>>>>>>>>> users, as is expecting them to download savepoints)
> >>> >>> > > > > > >>>>>>>>>>> * Purely personal opinion, but if we find better ways
> >>> >>> > > > > > >>>>>>>>>>> later, then retiring a CLI is no more lightweight than
> >>> >>> > > > > > >>>>>>>>>>> retiring a connector
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> It would be great if you give some examples
> >>> on how
> >>> >>> > user
> >>> >>> > > > > > >>>> could
> >>> >>> > > > > > >>>>>>>>> leverage
> >>> >>> > > > > > >>>>>>>>>>> the separate connector to process the
> metadata.
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> The simplest cases:
> >>> >>> > > > > > >>>>>>>>>>> * give me the overgrowing state uids
> >>> >>> > > > > > >>>>>>>>>>> * give me the unknown (new or renamed) state uids
> >>> >>> > > > > > >>>>>>>>>>> * give me the state uids where the state size dropped
> >>> >>> > > > > > >>>>>>>>>>> drastically compared to a previous savepoint
> >>> >>> > > > > > >>>>>>>>>>> (accidental state loss)
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> Since it was mentioned: as a general offtopic
> >>> >>> teaser,
> >>> >>> > > yeah
> >>> >>> > > > > > >> it
> >>> >>> > > > > > >>>>>>> would
> >>> >>> > > > > > >>>>>>>> be
> >>> >>> > > > > > >>>>>>>>>> good
> >>> >>> > > > > > >>>>>>>>>>> to have some sort of checkpoint/savepoint
> >>> lineage
> >>> >>> or
> >>> >>> > > > > > >> however
> >>> >>> > > > > > >>>> we
> >>> >>> > > > > > >>>>>>> call
> >>> >>> > > > > > >>>>>>>>> it.
> >>> >>> > > > > > >>>>>>>>>>> Since we've not yet reached this point there
> >>> are no
> >>> >>> > > > > > >> technical
> >>> >>> > > > > > >>>>>>>> details,
> >>> >>> > > > > > >>>>>>>>>> it's
> >>> >>> > > > > > >>>>>>>>>>> more like a vision. It's a common pattern
> that
> >>> >>> > > > > > >>>>>>>>>>> jobs are physically running but somehow the
> >>> state
> >>> >>> > > > > > >> processing
> >>> >>> > > > > > >>>> is
> >>> >>> > > > > > >>>>>>> stuck
> >>> >>> > > > > > >>>>>>>>> and
> >>> >>> > > > > > >>>>>>>>>>> it would be good to add some way to find it
> out
> >>> >>> > > > > > >>>> automatically.
> >>> >>> > > > > > >>>>>>>>>>> The key word here is automation and
> not
> >>> >>> manual
> >>> >>> > > > > > >>>>> evaluation
> >>> >>> > > > > > >>>>>>>> since
> >>> >>> > > > > > >>>>>>>>>>> handling 10k+ jobs is just not allowing that.
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> BR,
> >>> >>> > > > > > >>>>>>>>>>> G
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> On Wed, Mar 19, 2025 at 6:46 AM Shengkai
> Fang <
> >>> >>> > > > > > >>>>> fskm...@gmail.com>
> >>> >>> > > > > > >>>>>>>>> wrote:
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> Hi, All.
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> About State Catalog, I want to share more
> >>> thoughts
> >>> >>> > about
> >>> >>> > > > > > >>>> this.
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> In the initial design concept, I understood
> >>> that a
> >>> >>> > > > > > >>>> savepoint
> >>> >>> > > > > > >>>>>>> and a
> >>> >>> > > > > > >>>>>>>>>> state
> >>> >>> > > > > > >>>>>>>>>>>> catalog have a one-to-one mapping
> >>> relationship.
> >>> >>> Each
> >>> >>> > > > > > >>>> operator
> >>> >>> > > > > > >>>>>>>>>> corresponds
> >>> >>> > > > > > >>>>>>>>>>>> to a database, and the state of each
> operator
> >>> is
> >>> >>> > > > > > >>>> represented
> >>> >>> > > > > > >>>>> as
> >>> >>> > > > > > >>>>>>>>>>> individual
> >>> >>> > > > > > >>>>>>>>>>>> tables. The rationale behind this design is:
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> *State Diversity*: An operator may involve
> >>> >>> multiple
> >>> >>> > > types
> >>> >>> > > > > > >>>> of
> >>> >>> > > > > > >>>>>>>> states.
> >>> >>> > > > > > >>>>>>>>>> For
> >>> >>> > > > > > >>>>>>>>>>>> example, in our VVR design, a "multi-join"
> >>> >>> operator
> >>> >>> > uses
> >>> >>> > > > > > >>>> keyed
> >>> >>> > > > > > >>>>>>>> states
> >>> >>> > > > > > >>>>>>>>>> for
> >>> >>> > > > > > >>>>>>>>>>>> two input streams and a broadcast state for
> >>> the
> >>> >>> third
> >>> >>> > > > > > >>>> stream.
> >>> >>> > > > > > >>>>>>> This
> >>> >>> > > > > > >>>>>>>>>> makes
> >>> >>> > > > > > >>>>>>>>>>> it
> >>> >>> > > > > > >>>>>>>>>>>> challenging to represent all states of an
> >>> operator
> >>> >>> > > > > > >> within a
> >>> >>> > > > > > >>>>>>> single
> >>> >>> > > > > > >>>>>>>>>> table.
> >>> >>> > > > > > >>>>>>>>>>>> *Scalability*: Internally, an operator might
> >>> have
> >>> >>> > > > > > >> multiple
> >>> >>> > > > > > >>>>> keyed
> >>> >>> > > > > > >>>>>>>>> states
> >>> >>> > > > > > >>>>>>>>>>>> (e.g., value state and list state). However,
> >>> large
> >>> >>> > list
> >>> >>> > > > > > >>>> states
> >>> >>> > > > > > >>>>>>> may
> >>> >>> > > > > > >>>>>>>>> not
> >>> >>> > > > > > >>>>>>>>>>> fit
> >>> >>> > > > > > >>>>>>>>>>>> entirely in memory. To address this, we
> >>> recommend
> >>> >>> > > > > > >>>> implementing
> >>> >>> > > > > > >>>>>>> each
> >>> >>> > > > > > >>>>>>>>>> state
> >>> >>> > > > > > >>>>>>>>>>>> as a separate table.
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> To resolve the loosely coupled relationships
> >>> >>> between
> >>> >>> > > > > > >>>> operator
> >>> >>> > > > > > >>>>>>>> states,
> >>> >>> > > > > > >>>>>>>>>> we
> >>> >>> > > > > > >>>>>>>>>>>> propose embedding predefined views within
> the
> >>> >>> catalog.
> >>> >>> > > > > > >>>> These
> >>> >>> > > > > > >>>>>>> views
> >>> >>> > > > > > >>>>>>>>>>> simplify
> >>> >>> > > > > > >>>>>>>>>>>> user understanding of operator
> >>> implementations and
> >>> >>> > > > > > >> provide
> >>> >>> > > > > > >>>> a
> >>> >>> > > > > > >>>>>>> more
> >>> >>> > > > > > >>>>>>>>>>> intuitive
> >>> >>> > > > > > >>>>>>>>>>>> perspective. For instance, a join operator
> may
> >>> >>> have
> >>> >>> > > > > > >>>> multiple
> >>> >>> > > > > > >>>>>>> state
> >>> >>> > > > > > >>>>>>>>>>>> implementations (depending on whether the
> >>> join key
> >>> >>> > > > > > >> includes
> >>> >>> > > > > > >>>>>>> unique
> >>> >>> > > > > > >>>>>>>>>>>> attributes), but users primarily care about
> >>> the
> >>> >>> data
> >>> >>> > > > > > >>>>> associated
> >>> >>> > > > > > >>>>>>>> with
> >>> >>> > > > > > >>>>>>>>> a
> >>> >>> > > > > > >>>>>>>>>>>> specific join key across input streams.
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> Returning to the one-to-one mapping between
> >>> >>> savepoints
> >>> >>> > > > > > >> and
> >>> >>> > > > > > >>>>>>>> catalogs,
> >>> >>> > > > > > >>>>>>>>> we
> >>> >>> > > > > > >>>>>>>>>>> aim
> >>> >>> > > > > > >>>>>>>>>>>> to manage multiple user state catalogs
> >>> through a
> >>> >>> > catalog
> >>> >>> > > > > > >>>>> store.
> >>> >>> > > > > > >>>>>>>> When
> >>> >>> > > > > > >>>>>>>>> a
> >>> >>> > > > > > >>>>>>>>>>> user
> >>> >>> > > > > > >>>>>>>>>>>> triggers a savepoint for a job on the
> >>> platform:
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> 1. The platform sends a REST request to the
> >>> >>> > JobManager.
> >>> >>> > > > > > >>>>>>>>>>>> 2. Simultaneously, it registers a new state
> >>> >>> catalog in
> >>> >>> > > > > > >> the
> >>> >>> > > > > > >>>>>>> catalog
> >>> >>> > > > > > >>>>>>>>>> store,
> >>> >>> > > > > > >>>>>>>>>>>> enabling immediate analysis of state data on
> >>> the
> >>> >>> > > > > > >> platform.
> >>> >>> > > > > > >>>>>>>>>>>> 3. Deleting a savepoint would also trigger
> the
> >>> >>> removal
> >>> >>> > > of
> >>> >>> > > > > > >>>> its
> >>> >>> > > > > > >>>>>>>>>> associated
> >>> >>> > > > > > >>>>>>>>>>>> catalog.
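The lifecycle described in the three steps above can be sketched as a small in-memory catalog store. Everything here is hypothetical and only illustrates the one-to-one mapping between savepoints and catalogs: the `CatalogStore` class, the `savepoint-<id>` naming, and the `send_rest_request` callback are assumptions made for the example, not Flink APIs.

```python
class CatalogStore:
    """Hypothetical store that keeps exactly one state catalog per savepoint."""

    def __init__(self):
        self._catalogs = {}

    @staticmethod
    def catalog_name(savepoint_id):
        # Assumed naming scheme, mirroring the savepoint-${id} form used in this thread.
        return f"savepoint-{savepoint_id}"

    def register(self, savepoint_id, path):
        name = self.catalog_name(savepoint_id)
        if name in self._catalogs:
            raise ValueError(f"catalog already registered: {name}")
        self._catalogs[name] = path
        return name

    def remove(self, savepoint_id):
        # Removing an unknown savepoint is treated as a no-op.
        self._catalogs.pop(self.catalog_name(savepoint_id), None)

    def names(self):
        return sorted(self._catalogs)


def trigger_savepoint(store, savepoint_id, path, send_rest_request):
    """Steps 1 and 2: ask the JobManager for a savepoint, then register its catalog."""
    send_rest_request(path)  # placeholder for the platform's REST call
    return store.register(savepoint_id, path)


def delete_savepoint(store, savepoint_id):
    """Step 3: deleting a savepoint also removes its associated catalog."""
    store.remove(savepoint_id)
```

Keeping registration and removal next to the savepoint operations is one way to stop the catalog store from drifting out of sync with the savepoints that actually exist.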
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> This vision assumes that states are
> >>> >>> self-describing or
> >>> >>> > > > > > >>>> that a
> >>> >>> > > > > > >>>>>>> state
> >>> >>> > > > > > >>>>>>>>>>>> metaservice is introduced to analyze
> savepoint
> >>> >>> > > > > > >> structures.
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> How can users create logic to identify
> >>> >>> differences
> >>> >>> > > > > > >>>> between
> >>> >>> > > > > > >>>>>>>> multiple
> >>> >>> > > > > > >>>>>>>>>>>> savepoints?
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> Since savepoints and state catalogs are
> >>> one-to-one
> >>> >>> > > > > > >> mapped,
> >>> >>> > > > > > >>>>> users
> >>> >>> > > > > > >>>>>>>> can
> >>> >>> > > > > > >>>>>>>>>>> query
> >>> >>> > > > > > >>>>>>>>>>>> metadata via their respective catalogs. For
> >>> >>> example:
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> 1.
> >>> >>> > > > > > >>>>>
> >>> >>> `savepoint-${id}`.`system`.`metadata_table`.`<operator-name>`
> >>> >>> > > > > > >>>>>>>>>> provides
> >>> >>> > > > > > >>>>>>>>>>>> operator-specific metadata (e.g., state
> size,
> >>> >>> type).
> >>> >>> > > > > > >>>>>>>>>>>> 2. Comparing metadata tables (e.g., schema
> >>> >>> versions,
> >>> >>> > > > > > >> state
> >>> >>> > > > > > >>>>> entry
> >>> >>> > > > > > >>>>>>>>>> counts)
> >>> >>> > > > > > >>>>>>>>>>>> across catalogs reveals structural or
> >>> quantitative
> >>> >>> > > > > > >>>>> differences.
> >>> >>> > > > > > >>>>>>>>>>>> 3. For deeper analysis, users could write
> SQL
> >>> >>> queries
> >>> >>> > to
> >>> >>> > > > > > >>>>> compare
> >>> >>> > > > > > >>>>>>>>>> specific
> >>> >>> > > > > > >>>>>>>>>>>> state partitions or leverage the metaservice
> >>> to
> >>> >>> track
> >>> >>> > > > > > >> state
> >>> >>> > > > > > >>>>>>>> evolution
> >>> >>> > > > > > >>>>>>>>>>>> (e.g., added/removed operators, modified
> state
> >>> >>> > > > > > >>>>> configurations).
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> If we plan to introduce a state catalog in
> the
> >>> >>> > future, I
> >>> >>> > > > > > >>>> would
> >>> >>> > > > > > >>>>>>> lean
> >>> >>> > > > > > >>>>>>>>>>> toward
> >>> >>> > > > > > >>>>>>>>>>>> using metadata tables. If a utility tool can
> >>> >>> address
> >>> >>> > the
> >>> >>> > > > > > >>>>>>> challenges
> >>> >>> > > > > > >>>>>>>>> we
> >>> >>> > > > > > >>>>>>>>>>>> face, could we avoid introducing an
> additional
> >>> >>> > > connector?
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>>>>> Shengkai
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> Gyula Fóra <gyula.f...@gmail.com>
> >>> wrote on Mon, Mar 17, 2025,
> >>> >>> > > 20:25:
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> Hi All!
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> Without going into too much detail here are
> >>> my 2
> >>> >>> > cents
> >>> >>> > > > > > >>>>>>> regarding
> >>> >>> > > > > > >>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>> virtual column / catalog metadata / table
> >>> >>> (connector)
> >>> >>> > > > > > >>>>>>> discussion
> >>> >>> > > > > > >>>>>>>>> for
> >>> >>> > > > > > >>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>> State metadata.
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> State metadata such as the types of states,
> >>> their
> >>> >>> > > > > > >>>>> properties,
> >>> >>> > > > > > >>>>>>>>> names,
> >>> >>> > > > > > >>>>>>>>>>>> sizes
> >>> >>> > > > > > >>>>>>>>>>>>> etc are all valuable information that can
> be
> >>> >>> used to
> >>> >>> > > > > > >>>> enrich
> >>> >>> > > > > > >>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>> computations we do on state.
> >>> >>> > > > > > >>>>>>>>>>>>> We can either analyze it standalone (such
> as
> >>> >>> discover
> >>> >>> > > > > > >>>>>>> anomalies,
> >>> >>> > > > > > >>>>>>>>> for
> >>> >>> > > > > > >>>>>>>>>>>> large
> >>> >>> > > > > > >>>>>>>>>>>>> jobs with many states), across multiple
> >>> >>> savepoints
> >>> >>> > > > > > >>>> (discover
> >>> >>> > > > > > >>>>>>> how
> >>> >>> > > > > > >>>>>>>>>> state
> >>> >>> > > > > > >>>>>>>>>>>>> changed over time) or by joining it with
> >>> keyed or
> >>> >>> > > > > > >>>> non-keyed
> >>> >>> > > > > > >>>>>>> state
> >>> >>> > > > > > >>>>>>>>>> data
> >>> >>> > > > > > >>>>>>>>>>> to
> >>> >>> > > > > > >>>>>>>>>>>>> serve more complex queries on the state.
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> The only solution that seems to serve all
> >>> these
> >>> >>> > > > > > >> use-cases
> >>> >>> > > > > > >>>>> and
> >>> >>> > > > > > >>>>>>>>>>>> requirements
> >>> >>> > > > > > >>>>>>>>>>>>> in a straightforward and SQL canonical way
> >>> is to
> >>> >>> > simply
> >>> >>> > > > > > >>>>> expose
> >>> >>> > > > > > >>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>> state
> >>> >>> > > > > > >>>>>>>>>>>>> metadata as a separate table. This is a
> >>> metadata
> >>> >>> > table
> >>> >>> > > > > > >>>> but
> >>> >>> > > > > > >>>>> you
> >>> >>> > > > > > >>>>>>>> can
> >>> >>> > > > > > >>>>>>>>>> also
> >>> >>> > > > > > >>>>>>>>>>>>> think of it as data table, it makes no
> >>> practical
> >>> >>> > > > > > >>>> difference
> >>> >>> > > > > > >>>>>>> here.
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> Once we have a catalog later, the catalog
> can
> >>> >>> offer
> >>> >>> > > > > > >> this
> >>> >>> > > > > > >>>>> table
> >>> >>> > > > > > >>>>>>>> out
> >>> >>> > > > > > >>>>>>>>> of
> >>> >>> > > > > > >>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>> box, the same way databases provide
> metadata
> >>> >>> tables.
> >>> >>> > > > > > >> For
> >>> >>> > > > > > >>>>> this
> >>> >>> > > > > > >>>>>>> to
> >>> >>> > > > > > >>>>>>>>> work
> >>> >>> > > > > > >>>>>>>>>>>>> however we need another, simpler connector
> >>> that
> >>> >>> > creates
> >>> >>> > > > > > >>>> this
> >>> >>> > > > > > >>>>>>>> table.
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> +1 for state metadata as a separate
> >>> >>> connector/table,
> >>> >>> > > > > > >>>> instead
> >>> >>> > > > > > >>>>>>> of
> >>> >>> > > > > > >>>>>>>>>> adding
> >>> >>> > > > > > >>>>>>>>>>>>> virtual columns and adhoc catalog metadata
> >>> that
> >>> >>> is
> >>> >>> > hard
> >>> >>> > > > > > >>>> to
> >>> >>> > > > > > >>>>> use
> >>> >>> > > > > > >>>>>>>> in a
> >>> >>> > > > > > >>>>>>>>>>> large
> >>> >>> > > > > > >>>>>>>>>>>>> number of queries.
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> Cheers,
> >>> >>> > > > > > >>>>>>>>>>>>> Gyula
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>> On Mon, Mar 17, 2025 at 12:44 PM Gabor
> >>> Somogyi <
> >>> >>> > > > > > >>>>>>>>>>>> gabor.g.somo...@gmail.com>
> >>> >>> > > > > > >>>>>>>>>>>>> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>> I’m planning on adding this, and we may
> >>> >>> collaborate
> >>> >>> > > > > > >>>> on
> >>> >>> > > > > > >>>>> it
> >>> >>> > > > > > >>>>>>> in
> >>> >>> > > > > > >>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>> future.
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> +1 on this, just ping me.
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> After some code digging and POC all I can
> >>> say is
> >>> >>> that
> >>> >>> > > > > > >> with
> >>> >>> > > > > > >>>>>>> heavy
> >>> >>> > > > > > >>>>>>>>>> effort
> >>> >>> > > > > > >>>>>>>>>>> we
> >>> >>> > > > > > >>>>>>>>>>>>> can
> >>> >>> > > > > > >>>>>>>>>>>>>> maybe add such changes that we're able to
> >>> show
> >>> >>> > > > > > >> metadata
> >>> >>> > > > > > >>>>> of a
> >>> >>> > > > > > >>>>>>>>>>> savepoint
> >>> >>> > > > > > >>>>>>>>>>>>> from
> >>> >>> > > > > > >>>>>>>>>>>>>> catalog.
> >>> >>> > > > > > >>>>>>>>>>>>>> I'm not against that but from user
> >>> perspective
> >>> >>> this
> >>> >>> > > > > > >> has
> >>> >>> > > > > > >>>>>>> limited
> >>> >>> > > > > > >>>>>>>>>>> value,
> >>> >>> > > > > > >>>>>>>>>>>>> let
> >>> >>> > > > > > >>>>>>>>>>>>>> me explain why.
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> From high level perspective I see the
> >>> following
> >>> >>> > > > > > >> which I
> >>> >>> > > > > > >>>>> see
> >>> >>> > > > > > >>>>>>>>>> agreement
> >>> >>> > > > > > >>>>>>>>>>>> on:
> >>> >>> > > > > > >>>>>>>>>>>>>> * We should have a catalog which is
> >>> >>> representing one
> >>> >>> > > > > > >> or
> >>> >>> > > > > > >>>>> more
> >>> >>> > > > > > >>>>>>>> jobs
> >>> >>> > > > > > >>>>>>>>>>>>> savepoint
> >>> >>> > > > > > >>>>>>>>>>>>>> data set (future plan)
> >>> >>> > > > > > >>>>>>>>>>>>>> * Savepoints should be able to be
> >>> registered in
> >>> >>> the
> >>> >>> > > > > > >>>>> catalog
> >>> >>> > > > > > >>>>>>>> which
> >>> >>> > > > > > >>>>>>>>>> are
> >>> >>> > > > > > >>>>>>>>>>>>> then
> >>> >>> > > > > > >>>>>>>>>>>>>> databases (future plan)
> >>> >>> > > > > > >>>>>>>>>>>>>> * There must be a possibility to create
> >>> tables
> >>> >>> from
> >>> >>> > > > > > >>>>> databases
> >>> >>> > > > > > >>>>>>>>> where
> >>> >>> > > > > > >>>>>>>>>>>> users
> >>> >>> > > > > > >>>>>>>>>>>>>> can read state data (exists already)
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> In terms of metadata, If I understand
> >>> correctly
> >>> >>> then
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>>>> suggested
> >>> >>> > > > > > >>>>>>>>>>>>> approach
> >>> >>> > > > > > >>>>>>>>>>>>>> would be to access
> >>> >>> > > > > > >>>>>>>>>>>>>> it from the catalog describe command,
> right?
> >>> >>> Adding
> >>> >>> > > > > > >>>> that
> >>> >>> > > > > > >>>>>>> info
> >>> >>> > > > > > >>>>>>>>> when
> >>> >>> > > > > > >>>>>>>>>>>>> specific
> >>> >>> > > > > > >>>>>>>>>>>>>> database describe command
> >>> >>> > > > > > >>>>>>>>>>>>>> is executed could be done.
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> The question is for instance how can users
> >>> >>> create
> >>> >>> > > > > > >> such
> >>> >>> > > > > > >>>> a
> >>> >>> > > > > > >>>>>>> logic
> >>> >>> > > > > > >>>>>>>>> that
> >>> >>> > > > > > >>>>>>>>>>>> tells
> >>> >>> > > > > > >>>>>>>>>>>>>> them what is
> >>> >>> > > > > > >>>>>>>>>>>>>> the difference between multiple
> savepoints?
> >>> >>> > > > > > >>>>>>>>>>>>>> Just to give some examples:
> >>> >>> > > > > > >>>>>>>>>>>>>> * per operator size changes between
> >>> savepoints
> >>> >>> > > > > > >>>>>>>>>>>>>> * show values from operator data where
> state
> >>> >>> size
> >>> >>> > > > > > >>>> reaches
> >>> >>> > > > > > >>>>> a
> >>> >>> > > > > > >>>>>>>>>> boundary
> >>> >>> > > > > > >>>>>>>>>>>>>> * in general "find which checkpoint ruined
> >>> >>> things"
> >>> >>> > is
> >>> >>> > > > > > >>>>> quite
> >>> >>> > > > > > >>>>>>>>> common
> >>> >>> > > > > > >>>>>>>>>>>>> pattern
> >>> >>> > > > > > >>>>>>>>>>>>>> What I would like to highlight here is
> that
> >>> from
> >>> >>> > > > > > >> Flink
> >>> >>> > > > > > >>>>>>> point of
> >>> >>> > > > > > >>>>>>>>>> view
> >>> >>> > > > > > >>>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>>> metadata can be
> >>> >>> > > > > > >>>>>>>>>>>>>> considered as a static side output
> >>> information
> >>> >>> but
> >>> >>> > > > > > >> for
> >>> >>> > > > > > >>>>> users
> >>> >>> > > > > > >>>>>>>>> these
> >>> >>> > > > > > >>>>>>>>>>>> values
> >>> >>> > > > > > >>>>>>>>>>>>>> are actual real data
> >>> >>> > > > > > >>>>>>>>>>>>>> where logic is planned to build around.
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>> The metadata is more like one-time
> >>> information
> >>> >>> > > > > > >>>> instead
> >>> >>> > > > > > >>>>> of
> >>> >>> > > > > > >>>>>>> a
> >>> >>> > > > > > >>>>>>>>>>> streaming
> >>> >>> > > > > > >>>>>>>>>>>>>> data that changes all
> >>> >>> > > > > > >>>>>>>>>>>>>> the time, so a single connector seems to
> be
> >>> an
> >>> >>> > > > > > >>>> overkill.
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> State data is also static within a
> >>> savepoint and
> >>> >>> > > > > > >> that's
> >>> >>> > > > > > >>>>> the
> >>> >>> > > > > > >>>>>>>>> reason
> >>> >>> > > > > > >>>>>>>>>>> why
> >>> >>> > > > > > >>>>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>>> state processor API is working in batch
> >>> mode.
> >>> >>> > > > > > >>>>>>>>>>>>>> When we handle multiple checkpoints in a
> >>> >>> streaming
> >>> >>> > > > > > >>>> fashion
> >>> >>> > > > > > >>>>>>> then
> >>> >>> > > > > > >>>>>>>>>> this
> >>> >>> > > > > > >>>>>>>>>>>> can
> >>> >>> > > > > > >>>>>>>>>>>>> be
> >>> >>> > > > > > >>>>>>>>>>>>>> viewed from another angle.
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> We can come up with more lightweight
> >>> solution
> >>> >>> other
> >>> >>> > > > > > >>>> than a
> >>> >>> > > > > > >>>>>>> new
> >>> >>> > > > > > >>>>>>>>>>>> connector
> >>> >>> > > > > > >>>>>>>>>>>>>> but forcing users to parse the catalog
> >>> >>> > > > > > >>>>>>>>>>>>>> describe command output in order to
> compare
> >>> >>> multiple
> >>> >>> > > > > > >>>>>>> savepoints
> >>> >>> > > > > > >>>>>>>>>>> doesn't
> >>> >>> > > > > > >>>>>>>>>>>>>> sound smooth user experience.
> >>> >>> > > > > > >>>>>>>>>>>>>> Honestly I've no other idea how to expose
> >>> >>> metadata as
> >>> >>> > > > > > >>>> real
> >>> >>> > > > > > >>>>>>> user
> >>> >>> > > > > > >>>>>>>>> data
> >>> >>> > > > > > >>>>>>>>>>> so
> >>> >>> > > > > > >>>>>>>>>>>>>> waiting on other approaches.
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> BR,
> >>> >>> > > > > > >>>>>>>>>>>>>> G
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> On Thu, Mar 13, 2025 at 2:44 AM Shengkai
> >>> Fang <
> >>> >>> > > > > > >>>>>>>> fskm...@gmail.com
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>> Looking forward to hearing the good news!
> >>> >>> > > > > > >>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>>>>>>>> Shengkai
> >>> >>> > > > > > >>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com
> >
> >>> >>> > > > > > >>>> wrote on Wed, Mar 12, 2025,
> >>> >>> > > > > > >>>>>>>>> 22:24:
> >>> >>> > > > > > >>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>> Thanks for both the valuable input!
> >>> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>> Let me take a closer look at the
> >>> suggestions,
> >>> >>> > > > > > >> like
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>>>> Catalog
> >>> >>> > > > > > >>>>>>>>>>>>>>> capabilities
> >>> >>> > > > > > >>>>>>>>>>>>>>>> and possibility of embedding
> >>> TypeInformation
> >>> >>> or
> >>> >>> > > > > > >>>>>>>>>>>>>>>> StateDescriptor metadata directly into
> >>> the raw
> >>> >>> > > > > > >>>> state
> >>> >>> > > > > > >>>>>>>> files...
> >>> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>> BR,
> >>> >>> > > > > > >>>>>>>>>>>>>>>> G
> >>> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>> On Wed, Mar 12, 2025 at 8:17 AM Shengkai
> >>> Fang
> >>> >>> <
> >>> >>> > > > > > >>>>>>>>>> fskm...@gmail.com
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> Thanks for Zakelly's clarification.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> +1 to delay the discussion about this.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> I’d like to share my perspective on the
> >>> State
> >>> >>> > > > > > >>>>> Catalog
> >>> >>> > > > > > >>>>>>>>>> proposal.
> >>> >>> > > > > > >>>>>>>>>>>>> While
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> introducing this capability is
> >>> beneficial,
> >>> >>> > > > > > >> there
> >>> >>> > > > > > >>>> is
> >>> >>> > > > > > >>>>> a
> >>> >>> > > > > > >>>>>>>>>> blocker:
> >>> >>> > > > > > >>>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>>>>> current
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> StateBackend architecture does not
> permit
> >>> >>> > > > > > >>>> operators
> >>> >>> > > > > > >>>>> to
> >>> >>> > > > > > >>>>>>>>> encode
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> TypeInformation into the state—it only
> >>> >>> > > > > > >> preserves
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>>>>>> Serializer.
> >>> >>> > > > > > >>>>>>>>>>>>> This
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> limitation creates an asymmetry, as
> >>> operators
> >>> >>> > > > > > >>>> alone
> >>> >>> > > > > > >>>>>>>> retain
> >>> >>> > > > > > >>>>>>>>>>>>> knowledge
> >>> >>> > > > > > >>>>>>>>>>>>>> of
> >>> >>> > > > > > >>>>>>>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> data structure’s schema.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> To address this, I suggest allowing
> >>> operators
> >>> >>> > > > > > >> to
> >>> >>> > > > > > >>>>> embed
> >>> >>> > > > > > >>>>>>>>>>>>>> TypeInformation
> >>> >>> > > > > > >>>>>>>>>>>>>>> or
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> StateDescriptor metadata directly into
> >>> the
> >>> >>> raw
> >>> >>> > > > > > >>>> state
> >>> >>> > > > > > >>>>>>>> files.
> >>> >>> > > > > > >>>>>>>>>>> Such
> >>> >>> > > > > > >>>>>>>>>>>> a
> >>> >>> > > > > > >>>>>>>>>>>>>>> design
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> would enable the Catalog to:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> 1. Parse state files and
> programmatically
> >>> >>> > > > > > >> derive
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>>> schema
> >>> >>> > > > > > >>>>>>>>>> and
> >>> >>> > > > > > >>>>>>>>>>>>>>>> structural
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> guarantees for each state.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> 2. Leverage existing Flink Table
> >>> utilities,
> >>> >>> > > > > > >> such
> >>> >>> > > > > > >>>> as
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> LegacyTypeInfoDataTypeConverter (in
> >>> >>> > > > > > >>>>>>>>>>>>>>> org.apache.flink.table.types.utils),
> >>> >>> > > > > > >>>>>>>>>>>>>>>> to
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> bridge TypeInformation and DataType
> >>> >>> > > > > > >> conversions.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> If we can not store the TypeInformation
> >>> or
> >>> >>> > > > > > >>>>>>>> StateDescriptor
> >>> >>> > > > > > >>>>>>>>>> into
> >>> >>> > > > > > >>>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>>> raw
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> state files, I am +1 for this FLIP to
> use
> >>> >>> > > > > > >>>> metadata
> >>> >>> > > > > > >>>>>>> column
> >>> >>> > > > > > >>>>>>>>> to
> >>> >>> > > > > > >>>>>>>>>>>>> retrieve
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> information.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> Shengkai
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> Zakelly Lan <zakelly....@gmail.com>
> >>> >>> > > > > > >>>> wrote on Wed, Mar 12, 2025,
> >>> >>> > > > > > >>>>>>>> 12:43:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Hi Gabor and Shengkai,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Thanks for sharing your thoughts! This
> >>> is a
> >>> >>> > > > > > >>>> long
> >>> >>> > > > > > >>>>>>>>> discussion
> >>> >>> > > > > > >>>>>>>>>>> and
> >>> >>> > > > > > >>>>>>>>>>>>>> sorry
> >>> >>> > > > > > >>>>>>>>>>>>>>>> for
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> the late reply (I'm busy catching up
> >>> with
> >>> >>> > > > > > >>>> release
> >>> >>> > > > > > >>>>>>> 2.0
> >>> >>> > > > > > >>>>>>>>> these
> >>> >>> > > > > > >>>>>>>>>>>>> days).
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Let me first clarify your thoughts to
> >>> ensure
> >>> >>> > > > > > >> I
> >>> >>> > > > > > >>>>>>>> understand
> >>> >>> > > > > > >>>>>>>>>>>>>> correctly.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> IIUC,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> there is no persistent configuration
> for
> >>> >>> > > > > > >> state
> >>> >>> > > > > > >>>> TTL
> >>> >>> > > > > > >>>>>>> in
> >>> >>> > > > > > >>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>>> checkpoint.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> While
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> you can infer that TTL is enabled by
> >>> reading
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>>>>> serializer,
> >>> >>> > > > > > >>>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> checkpoint
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> itself only stores the last access
> time
> >>> for
> >>> >>> > > > > > >>>> each
> >>> >>> > > > > > >>>>>>> value.
> >>> >>> > > > > > >>>>>>>>> So
> >>> >>> > > > > > >>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>> only
> >>> >>> > > > > > >>>>>>>>>>>>>>>> thing
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> we can show is the last access time
> for
> >>> each
> >>> >>> > > > > > >>>>> value.
> >>> >>> > > > > > >>>>>>> But
> >>> >>> > > > > > >>>>>>>>> it
> >>> >>> > > > > > >>>>>>>>>> is
> >>> >>> > > > > > >>>>>>>>>>>> not
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> required
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> for all state backends to store this,
> as
> >>> >>> they
> >>> >>> > > > > > >>>> may
> >>> >>> > > > > > >>>>>>>>> directly
> >>> >>> > > > > > >>>>>>>>>>>> store
> >>> >>> > > > > > >>>>>>>>>>>>>> the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> expired time. This will also increase
> >>> the
> >>> >>> > > > > > >>>>>>> difficulty of
> >>> >>> > > > > > >>>>>>>>>>>>>>> implementation
> >>> >>> > > > > > >>>>>>>>>>>>>>>> &
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> maintenance.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> This once again reiterates the
> >>> importance of
> >>> >>> > > > > > >>>>> unified
> >>> >>> > > > > > >>>>>>>>>> metadata
> >>> >>> > > > > > >>>>>>>>>>>> for
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> checkpoints. I’m planning on adding
> >>> this,
> >>> >>> and
> >>> >>> > > > > > >>>> we
> >>> >>> > > > > > >>>>> may
> >>> >>> > > > > > >>>>>>>>>>>> collaborate
> >>> >>> > > > > > >>>>>>>>>>>>> on
> >>> >>> > > > > > >>>>>>>>>>>>>>> it
> >>> >>> > > > > > >>>>>>>>>>>>>>>> in
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> the future.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> I'm not in favor of adding a new
> >>> connector
> >>> >>> > > > > > >> for
> >>> >>> > > > > > >>>>>>>> metadata.
> >>> >>> > > > > > >>>>>>>>>> The
> >>> >>> > > > > > >>>>>>>>>>>>>> metadata
> >>> >>> > > > > > >>>>>>>>>>>>>>>> is
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> more like one-time information instead
> >>> of a
> >>> >>> > > > > > >>>>>>> streaming
> >>> >>> > > > > > >>>>>>>>> data
> >>> >>> > > > > > >>>>>>>>>>> that
> >>> >>> > > > > > >>>>>>>>>>>>>>> changes
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> all
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> the time, so a single connector seems
> >>> to be
> >>> >>> > > > > > >> an
> >>> >>> > > > > > >>>>>>>> overkill.
> >>> >>> > > > > > >>>>>>>>> It
> >>> >>> > > > > > >>>>>>>>>>> is
> >>> >>> > > > > > >>>>>>>>>>>>> not
> >>> >>> > > > > > >>>>>>>>>>>>>>> easy
> >>> >>> > > > > > >>>>>>>>>>>>>>>>> to
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> withdraw a connector if we have a
> better
> >>> >>> > > > > > >>>> solution
> >>> >>> > > > > > >>>>> in
> >>> >>> > > > > > >>>>>>>>>> future.
> >>> >>> > > > > > >>>>>>>>>>>> I'm
> >>> >>> > > > > > >>>>>>>>>>>>>> not
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> familiar with current Catalog
> >>> capabilities,
> >>> >>> > > > > > >>>> and if
> >>> >>> > > > > > >>>>>>> it
> >>> >>> > > > > > >>>>>>>>> could
> >>> >>> > > > > > >>>>>>>>>>>>> extract
> >>> >>> > > > > > >>>>>>>>>>>>>>> and
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> show some operator-level information
> >>> from
> >>> >>> > > > > > >>>>> savepoint,
> >>> >>> > > > > > >>>>>>>> that
> >>> >>> > > > > > >>>>>>>>>>> would
> >>> >>> > > > > > >>>>>>>>>>>>> be
> >>> >>> > > > > > >>>>>>>>>>>>>>>> great.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> If the Catalog can't do that, I would
> >>> >>> > > > > > >> consider
> >>> >>> > > > > > >>>> the
> >>> >>> > > > > > >>>>>>>>> current
> >>> >>> > > > > > >>>>>>>>>>> FLIP
> >>> >>> > > > > > >>>>>>>>>>>>> to
> >>> >>> > > > > > >>>>>>>>>>>>>>> be a
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> compromise solution.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> And if we have that unified metadata for checkpoints/savepoints in the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> future, we may directly register a savepoint in the catalog, create a
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> source without specifying complex columns, and describe the savepoint
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> catalog to get the metadata. That's a good solution in my mind.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Zakelly
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>> On Wed, Mar 12, 2025 at 10:35 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> Hi Gabor,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with `savepoint-metadata`
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> I would argue against introducing a new connector type named
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> savepoint-metadata, as the existing Catalog mechanism can inherently
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> provide the necessary connector factory capabilities. I’ve detailed this
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> proposal in branch[1]. Please take a moment to review it.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> If we introduce a connector named `savepoint-metadata`, it means users can
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> create a temporary table with the connector `savepoint-metadata`, and the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> connector needs to check whether the table schema is the same as the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> schema we proposed in the FLIP. On the other hand, it's not easy for
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> others to use a metadata table with the same schema.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> [1]
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> https://github.com/apache/flink/compare/master...fsk119:flink:state-metadata?expand=1#diff-712a7bc92fe46c405fb0e61b475bb2a005cb7a72bab7df28bbb92744bcb5f465R63
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> Shengkai
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on Tue, Mar 11, 2025 at 16:56:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Hi Shengkai,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Directionally, I agree with your idea of how it can be implemented.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> As I mentioned previously, TTL information is not exposed by the state
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> processor API (which the SQL state connector uses to read data), and
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> unless somebody shows me otherwise, this FLIP is not going to address
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> it, to avoid feature creep. Our users are also interested in TTL, so
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> sooner or later we're going to expose it; this is a matter of
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> scheduling.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with `savepoint-metadata`
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I'm not sure I fully understand your point about StateCatalog. First of
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> all, I can't agree more that StateCatalog is needed, and it is a planned
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> building block in an upcoming FLIP, but I'm not sure how it can help
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> now. No matter what, your knowledge is essential when we add
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> StateCatalog. Let me lay out my understanding in this area:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * First we need create table statements to access state data and metadata
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * Once we have that, we can add StateCatalog, which could potentially
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> ease the life of users by, for example, giving off-the-shelf tables
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> without sweating with create table statements
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> User expectations:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See state data (this is fulfilled with the existing connector)
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See metadata about state data like TTL (this can be added as a
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> metadata column as you suggested, since it belongs to the data)
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See metadata about operators (this can be added from
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> savepoint-metadata)
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> It is important to highlight that the state data table format differs
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> from the state metadata table format. Namely, one table has rows for
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> state values and the other has rows for operators, right?
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I think that's the reason why you've pointed out that the suggested
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> metadata columns are somewhat clunky.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> In conclusion, I agree with adding a ${state-name}_ttl metadata column
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> later on, since it belongs to the state value, and with adding a new
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> table type for metadata (like you suggested, similar to PG [1]).
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Please see how Spark does that too [2].
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> If you have a better approach, then please elaborate with more details
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> and help me understand your point.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in TB savepoints that the number of keys
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> can be extremely huge but not the per key state itself.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> But again, this is a good feature as-is and can be handled in a
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> separate jira.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I've just created https://issues.apache.org/jira/browse/FLINK-37456.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> [2] https://www.databricks.com/blog/announcing-state-reader-api-new-statestore-data-source
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> BR,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> G
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> On Tue, Mar 11, 2025 at 3:55 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Hi, Gabor. Thanks for your response.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Thank you for addressing the limitations here. However, I believe it
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> would be beneficial to further clarify the API in this FLIP regarding
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> how users can specify the TTL column.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> One potential approach that comes to mind is using a standardized
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> naming convention such as ${state-name}_ttl for the metadata column
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> that defines the TTL value. In terms of implementation, the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> listReadableMetadata function could:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. Read the table’s columns and configuration,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Extract all defined state names, and
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 3. Return a structured list of metadata entries formatted as
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> ${state-name}_ttl.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> WDYT?
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
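A minimal sketch of the three steps listed above, assuming the state names have already been extracted from the table's columns and configuration. The class `TtlMetadataKeys`, the method `ttlMetadataKeys`, and the `BIGINT` type for the TTL value are hypothetical and are not taken from the FLIP. In Flink the natural hook would be `SupportsReadingMetadata#listReadableMetadata`, which returns a `Map<String, DataType>`; a plain string stands in for the data type here so that the example has no Flink dependency.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TtlMetadataKeys {
    // Suffix from the naming convention proposed in the thread: ${state-name}_ttl.
    static final String TTL_SUFFIX = "_ttl";

    // Assumption: the TTL is exposed as a BIGINT; the thread does not fix a type.
    static final String ASSUMED_TTL_TYPE = "BIGINT";

    // Builds one metadata entry per state name, keeping the input order.
    static Map<String, String> ttlMetadataKeys(List<String> stateNames) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String stateName : stateNames) {
            metadata.put(stateName + TTL_SUFFIX, ASSUMED_TTL_TYPE);
        }
        return metadata;
    }

    public static void main(String[] args) {
        // Hypothetical state names, used only for illustration.
        System.out.println(ttlMetadataKeys(List.of("my-value-state", "my-list-state")));
    }
}
```

Deriving the keys from the declared state names would keep the set of readable metadata in sync with the table definition, which seems to be the point of the convention.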
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with `savepoint-metadata`
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Introducing a new connector type at this stage may unnecessarily
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> complicate the system. Given that every table already belongs to a
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Catalog, which is designed to provide a Factory for building source or
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> sink connectors, I propose integrating a dedicated StateCatalog
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> instead. This approach would allow us to:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. Leverage the Catalog’s existing capabilities to manage TTL metadata
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> (e.g., state names and TTL logic) without duplicating functionality.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Provide a unified interface for connector instantiation and metadata
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> handling through the Catalog’s Factory pattern.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Would this design decision better align with our architecture’s
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> extensibility and reduce redundancy?
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in TB savepoints that the number of keys
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> can be extremely huge but not the per key state itself.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> But again, this is a good feature as-is and can be handled in a
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> separate jira.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> +1 for a separate jira.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Shengkai
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on Mon, Mar 10, 2025 at 19:05:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Hi Shengkai,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Please see my comments inline.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> BR,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> G
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 3, 2025 at 7:07 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Hi, Gabor. Thanks for the FLIP. I have some questions about the FLIP:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> How can users retrieve the state TTL (Time-to-Live) for each value
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> column? From my understanding of the current design, it seems that
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> this functionality is not supported. Could you clarify if there are
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> plans to address this limitation?
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Since the state processor API does not yet expose this information,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> this would require several steps.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> First, support needs to be added to the state processor API, which
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> can then be exposed on the SQL API.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This is definitely a useful future improvement and can be handled in
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> a separate jira.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> The metadata information described in the FLIP appears to be intended
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> to describe the state files stored at a specific location. To me,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> this concept aligns more closely with system tables like pg_tables in
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> PostgreSQL [1] or the INFORMATION_SCHEMA in MySQL [2].
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Adding a new connector with `savepoint-metadata` is one way we could
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> create such functionality.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> I'm not against that; I just want to have a common agreement that we
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> would like to move in that direction.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> (As a side note, not just PG but Spark also has a similar approach,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> and I basically like the idea.)
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> If we go in that direction, savepoint metadata can be exposed in a way
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> that one row represents an operator with its values, something like
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> this:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>
> >>> >>> > > > > >
> >>> >>> > > > >
> >>> >>> > > >
> >>> >>> > >
> >>> >>> >
> >>> >>>
> >>>
> ┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬────────┐
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>
> >>> >>> > > > > >
> >>> >>> > > > >
> >>> >>> > > >
> >>> >>> > >
> >>> >>> >
> >>> >>>
> >>>
> │operatorN│operatorU│operatorH│paralleli│maxParall│subtaskSt│coordinat│totalSta│
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │ame      │id       │ash      │sm
> >>> >>> > > > > > >>>>>>> │elism
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │atesCount│orStateSi│tesSizeI│
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │         │         │         │         │         │         │zeInBytes│nBytes  │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┤
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │Source:  │datagen-s│47aee9439│2        │128      │2        │16       │546     │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │datagen-s│ource-uid│4d6ea26e2│         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │ource    │         │d544bef0a│         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │         │         │37bb5    │         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┤
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │long-udf-│long-udf-│6ed3f40bf│2        │128      │2        │0        │0       │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │with-mast│with-mast│f3c8dfcdf│         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │er-hook  │er-hook-u│cb95128a1│         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │         │id       │018f1    │         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┤
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │value-pro│value-pro│ca4f5fe9a│2        │128      │2        │0        │40726   │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │cess     │cess-uid │637b656f0│         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │         │         │9ea78b3e7│         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │         │         │a15b9    │         │         │         │         │        │
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┤
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This table can then be joined with the tables created by the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> existing `savepoint` connector, based on the UID hash (which is
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> unique and always exists).
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This would mean that the existing table needs only a single
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> metadata column, the UID hash.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> WDYT?
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> @zakelly, please share your thoughts too.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> If we opt to use metadata columns, every record in the table
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> would end up having identical values for these columns (please
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> correct me if I’m mistaken). On the other hand, the state
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> connector requires users to specify an operator UID or operator
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> UID hash, after which it outputs user-defined values in its
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> records. This approach feels somewhat redundant to me.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> If we add a new `savepoint-metadata` connector, this can be
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> addressed.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> On the other hand, UID and UID hash have an either-or
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> relationship from the config perspective, so a user who provides
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> the UID may still be interested in the hash for further
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> calculations (Flink internals depend on the hash throughout).
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Printing out the human-readable UID is an explicit requirement
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> from the user side because hashes are not human-readable.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 3. Handling LIST and MAP States in the State Connector
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> I have concerns about how the current design handles LIST and
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> MAP states. Specifically, the state connector uses Flink SQL’s
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> MAP and ARRAY types, which implies that it attempts to load
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> entire MAP or LIST states into memory.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> However, in many real-world scenarios, these states can grow
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> very large. Typically, the state API addresses this by providing
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> an iterator to traverse elements within the state incrementally.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> I’m unsure whether I’ve missed something in FLIP-496 or
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> FLIP-512, but it seems that the current design might struggle
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> with scalability in such cases.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> You're right, the current implementation keeps the state for a
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> single key in memory.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> We considered this potential issue back then and concluded that
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> handling it is not strictly needed for the initial version and
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> can be done as a later improvement.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> So far we've seen, even in TB-sized savepoints, that the number
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> of keys can be extremely large but the per-key state itself is
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> not.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> But again, this would be a good feature in its own right and
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> can be handled in a separate Jira.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Shengkai
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> [2] https://dev.mysql.com/doc/refman/8.4/en/information-schema-tables-table.html
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 3, 2025 at 2:00 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> Hi Zakelly,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> To keep things simple, the target is `METADATA VIRTUAL` as the
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> keywords for the definition.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> If it's not too complex, the latter can be added too.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> BR,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> G
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Mar 2, 2025 at 3:37 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Hi Gabor,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> +1 for this.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Will the metadata column use `METADATA VIRTUAL` as keywords for
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> definition, or `METADATA FROM xxx VIRTUAL` for renaming, just
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> like the Kafka table?
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Zakelly
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Mar 1, 2025 at 1:31 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Hi All,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> I'd like to start a discussion of FLIP-512: Add meta information
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> to SQL state connector [1].
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Feel free to add your thoughts to make this feature better.
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-512%3A+Add+meta+information+to+SQL+state+connector
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> BR,
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> G
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>
> >>> >>> > > > > >
> >>> >>> > > > > >
> >>> >>> > > > >
> >>> >>> > > >
> >>> >>> > >
> >>> >>> >
> >>> >>>
> >>> >>
> >>>
> >>
>
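
For readers following the LIST/MAP discussion quoted above, here is a minimal Java sketch of the difference between materializing a whole per-key state (which an ARRAY or MAP column value implies) and traversing it incrementally through an iterator. Everything in it is an assumption made for illustration: `ListStateSketch`, `lazyEntries`, `materialize` and `sumIncrementally` are hypothetical names, and none of this is Flink API or the state connector's actual code.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for one key's LIST state; not a Flink class. */
final class ListStateSketch {

    /** Produces the entries 0..size-1 on demand, so nothing is held in memory up front. */
    static Iterator<Long> lazyEntries(long size) {
        return new Iterator<Long>() {
            private long next = 0;

            @Override
            public boolean hasNext() {
                return next < size;
            }

            @Override
            public Long next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return next++;
            }
        };
    }

    /** Eager read: copies every entry into a list, so memory grows with the state size. */
    static List<Long> materialize(Iterator<Long> entries) {
        List<Long> all = new ArrayList<>();
        while (entries.hasNext()) {
            all.add(entries.next());
        }
        return all;
    }

    /** Incremental read: visits one entry at a time and keeps only the running sum. */
    static long sumIncrementally(Iterator<Long> entries) {
        long sum = 0;
        while (entries.hasNext()) {
            sum += entries.next();
        }
        return sum;
    }
}
```

With a small per-key state both reads are fine, which matches the observation in the thread that the number of keys is usually what grows, not the per-key state. The eager variant only becomes a problem when a single key holds a very large list or map.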
