Hi Gabor,

Just one minor suggestion: is it possible to mark this @Experimental? I
expect some metadata columns to be added or changed in the near future, and
I'd like to keep some flexibility until the feature reaches maturity.


Best,
Zakelly

On Fri, Mar 28, 2025 at 8:47 PM Gabor Somogyi <gabor.g.somo...@gmail.com>
wrote:

> Just to avoid jumping from one thread to another here's Timo's suggestion:
>
> "if I understand the discussion correctly, you want to use a PTF without
> table arguments to return a table (read from savepoint metadata)? If
> this is the case, you don't need a PTF for it. A regular table function
> can also do the job. IIRC we support TVF with constant args."
>
> I've tried the TVF out and it works in batch mode like a charm.
>
> > I'd also suggest we make it built-in without registration.
> I basically agree to have this as built-in.
>
> From my perspective there are no further questions or concerns, and I've
> updated the FLIP accordingly.
> If somebody has any, please share; otherwise I would like to go on with the
> vote.
>
> All in all, thanks to everybody for the constructive suggestions, we've
> made things really better.
>
> BR,
> G
>
>
> On Fri, Mar 28, 2025 at 9:19 AM Zakelly Lan <zakelly....@gmail.com> wrote:
>
> > Hi all,
> >
> > Given the simplicity, I also +1 for PTF or any other function
> > implementation if PTF is not applicable for this.
> >
> > I would like to raise a consideration regarding the usage implementation:
> > > Would it be necessary to allow users to utilize the CREATE FUNCTION
> > > statement for registering the PTF?
> >
> >
> >  I'd also suggest we make it built-in without registration.
> >
> > Currently, Flink SQL supports letting external systems register modules
> and
> > > leverage these modules to centrally manage all function definitions.
> > Given
> > > this architectural approach, I’m curious if the plan involves
> introducing
> > > additional functions in the future. If so, I would advocate for
> > introducing
> > > a dedicated state module to centralize such management. This would
> > empower
> > > users to:
> >
> >
> > I can’t think of any further functions for now, but I'd +1 for a module
> if
> > it could omit the registration.
> >
> >
> > Best,
> > Zakelly.
> >
> >
> >
> > On Fri, Mar 28, 2025 at 10:25 AM Shengkai Fang <fskm...@gmail.com>
> wrote:
> >
> > > One more question about the FLIP.
> > >
> > > I think the output schema is definitely a public API to users. If users
> > > use the `CREATE FUNCTION` statement, does that mean the class path is
> > > also a public API to users? Alternatively, this is merely an
> > > experimental feature and we make no promises about this function.
> > >
> > > Best,
> > > Shengkai
> > >
> > > On Fri, Mar 28, 2025 at 10:20 AM Shengkai Fang <fskm...@gmail.com> wrote:
> > >
> > >> +1 to use PTF.
> > >>
> > >> I would like to raise a consideration regarding the usage
> > implementation:
> > >> Would it be necessary to allow users to utilize the CREATE FUNCTION
> > >> statement for registering the PTF?
> > >>
> > >> Currently, Flink SQL supports letting external systems register
> modules
> > >> and leverage these modules to centrally manage all function
> definitions.
> > >> Given this architectural approach, I’m curious if the plan involves
> > >> introducing additional functions in the future. If so, I would
> advocate
> > for
> > >> introducing a dedicated state module to centralize such management.
> This
> > >> would empower users to:
> > >>
> > >> 1. Simply execute the LOAD MODULE command to load the required module,
> > and
> > >> 2. Directly invoke read_metadata thereafter.
> > >>
> > >> For more details about the module, please refer to this document[1].
> > >>
> > >> Best,
> > >> Shengkai
> > >>
> > >> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/modules/
> > >>
> > >> On Fri, Mar 28, 2025 at 12:26 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > >>
> > >>> Just found out that PTF in batch mode is not supported, plz see the
> dev
> > >>> mailing about it [1].
> > >>>
> > >>> [1] https://lists.apache.org/thread/ytm9m1qt4pq2q2gjngfktrn8vrlvkf07
> > >>>
> > >>> BR,
> > >>> G
> > >>>
> > >>>
> > >>> On Thu, Mar 27, 2025 at 3:38 PM Gabor Somogyi <
> > gabor.g.somo...@gmail.com
> > >>> >
> > >>> wrote:
> > >>>
> > >>> > In the meantime I've just updated the FLIP according to this to be
> > >>> > optimistic 🙂
> > >>> >
> > >>> > BR,
> > >>> > G
> > >>> >
> > >>> > On Thu, Mar 27, 2025 at 2:15 PM Gabor Somogyi <
> > >>> gabor.g.somo...@gmail.com>
> > >>> > wrote:
> > >>> >
> > >>> >> Considering all the facts I also +1 on PTF. Even if something is
> > >>> missing
> > >>> >> we can add later.
> > >>> >>
> > >>> >> @Zakelly Lan <zakelly....@gmail.com> @Shengkai Fang are you also
> on
> > >>> the
> > >>> >> same page or have something to add?
> > >>> >>
> > >>> >> BR,
> > >>> >> G
> > >>> >>
> > >>> >>
> > >>> >> On Thu, Mar 27, 2025 at 1:50 PM Lincoln Lee <
> lincoln.8...@gmail.com
> > >
> > >>> >> wrote:
> > >>> >>
> > >>> >>> +1 for PTF
> > >>> >>>
> > >>> >>> > Is it possible to describe such function to see the column
> > >>> names/types?
> > >>> >>>
> > >>> >>> Although Flink SQL does not directly support this feature, users
> > can
> > >>> >>> achieve
> > >>> >>> similar results with the help of `explain` syntax, e.g.
> > >>> >>> 'explain select * from read_state_metadata(...)'
> > >>> >>>
> > >>> >>>
> > >>> >>> Best,
> > >>> >>> Lincoln Lee
> > >>> >>>
> > >>> >>>
> > >>> >>> On Thu, Mar 27, 2025 at 8:41 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> > >>> >>>
> > >>> >>> > Hey!
> > >>> >>> >
> > >>> >>> > I think the PTF approach strikes a great balance in simplicity
> > and
> > >>> the
> > >>> >>> > capabilities that we get out of it.
> > >>> >>> >
> > >>> >>> > I think this could be a completely viable alternative to the
> > >>> dedicated
> > >>> >>> > connector, +1.
> > >>> >>> >
> > >>> >>> > Cheers,
> > >>> >>> > Gyula
> > >>> >>> >
> > >>> >>> > On Thu, Mar 27, 2025 at 10:37 AM Shengkai Fang <
> > fskm...@gmail.com>
> > >>> >>> wrote:
> > >>> >>> >
> > >>> >>> > > Hi, Gabor.
> > >>> >>> > >
> > >>> >>> > > > Do I understand correctly that this is 2.x only feature and
> > we
> > >>> >>> can't
> > >>> >>> > > backport it to 1.x line
> > >>> >>> > >
> > >>> >>> > > Yes. PTF is only supported in the 2.x versions.
> > >>> >>> > >
> > >>> >>> > > > Is it possible to describe such function to see the column
> > >>> >>> names/types?
> > >>> >>> > >
> > >>> >>> > > Flink SQL doesn't support this feature, but PostgreSQL[2] and
> > >>> >>> > > MySQL[1] have similar features.
> > >>> >>> > >
> > >>> >>> > > [1] https://dev.mysql.com/doc/refman/8.4/en/show-create-procedure.html
> > >>> >>> > > [2] https://stackoverflow.com/questions/6898453/show-the-code-of-a-function-procedure-and-trigger-in-postgresql
> > >>> >>> > >
> > >>> >>> > > Best,
> > >>> >>> > > Shengkai
> > >>> >>> > >
> > >>> >>> > >
> > >>> >>> > > On Thu, Mar 27, 2025 at 4:25 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > >>> >>> > >
> > >>> >>> > > > Hi Shengkai,
> > >>> >>> > > >
> > >>> >>> > > > Thanks for your effort with the example, this looks
> > promising.
> > >>> >>> > > > I like the fact that users wouldn't need to sweat with
> > complex
> > >>> >>> create
> > >>> >>> > > table
> > >>> >>> > > > statements.
> > >>> >>> > > >
> > >>> >>> > > > Couple of questions:
> > >>> >>> > > > * Do I understand correctly that this is 2.x only feature
> and
> > >>> we
> > >>> >>> can't
> > >>> >>> > > > backport it to 1.x line?
> > >>> >>> > > > I don't intend to do any backport, I'd just like to
> know
> > >>> the
> > >>> >>> > > technical
> > >>> >>> > > > constraints.
> > >>> >>> > > > * Is it possible to describe such function to see the
> column
> > >>> >>> > names/types?
> > >>> >>> > > >
> > >>> >>> > > > BR,
> > >>> >>> > > > G
> > >>> >>> > > >
> > >>> >>> > > >
> > >>> >>> > > > On Thu, Mar 27, 2025 at 3:17 AM Shengkai Fang <
> > >>> fskm...@gmail.com>
> > >>> >>> > wrote:
> > >>> >>> > > >
> > >>> >>> > > > > Many thanks for your reminder, Leonard. Here's the link I
> > >>> >>> > mentioned[1].
> > >>> >>> > > > >
> > >>> >>> > > > > Best,
> > >>> >>> > > > > Shengkai
> > >>> >>> > > > >
> > >>> >>> > > > > [1] https://github.com/apache/flink/pull/26358
> > >>> >>> > > > >
> > >>> >>> > > > > On Thu, Mar 27, 2025 at 10:05 AM Leonard Xu <xbjt...@gmail.com> wrote:
> > >>> >>> > > > >
> > >>> >>> > > > > > Your link is broken, Shengkai
> > >>> >>> > > > > >
> > >>> >>> > > > > > Best,
> > >>> >>> > > > > > Leonard
> > >>> >>> > > > > >
> > >>> >>> > > > > > > On Mar 27, 2025, at 10:01, Shengkai Fang <fskm...@gmail.com> wrote:
> > >>> >>> > > > > > >
> > >>> >>> > > > > > > Hi, All.
> > >>> >>> > > > > > >
> > >>> >>> > > > > > > I wrote a simple demo to illustrate my idea. Hope
> this
> > >>> helps.
> > >>> >>> > > > > > >
> > >>> >>> > > > > > > Best,
> > >>> >>> > > > > > > Shengkai
> > >>> >>> > > > > > >
> > >>> >>> > > > > > > https://github.com/apache/flink/compare/master...fsk119:flink:example?expand=1
> > >>> >>> > > > > > >
> > >>> >>> > > > > > > On Wed, Mar 26, 2025 at 3:54 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > >>> >>> > > > > > >
> > >>> >>> > > > > > >>> I'm fine with a separate SQL connector for
> metadata,
> > so
> > >>> >>> maybe
> > >>> >>> > we
> > >>> >>> > > > > could
> > >>> >>> > > > > > >> update the FLIP about our discussion?
> > >>> >>> > > > > > >>
> > >>> >>> > > > > > >> Sorry, I've forgotten this part. Yeah, no matter what we
> > >>> choose
> > >>> >>> I'm
> > >>> >>> > > going
> > >>> >>> > > > > to
> > >>> >>> > > > > > >> update the FLIP.
> > >>> >>> > > > > > >>
> > >>> >>> > > > > > >> G
> > >>> >>> > > > > > >>
> > >>> >>> > > > > > >>
> > >>> >>> > > > > > >> On Wed, Mar 26, 2025 at 8:51 AM Gabor Somogyi <
> > >>> >>> > > > > > gabor.g.somo...@gmail.com>
> > >>> >>> > > > > > >> wrote:
> > >>> >>> > > > > > >>
> > >>> >>> > > > > > >>> Hi All,
> > >>> >>> > > > > > >>>
> > >>> >>> > > > > > >>> I also lack knowledge of PTFs, so I've read
> > >>> just
> > >>> >>> the
> > >>> >>> > > > > motivation
> > >>> >>> > > > > > >>> part:
> > >>> >>> > > > > > >>>
> > >>> >>> > > > > > >>> "The SQL 2016 standard introduced a way of defining
> > >>> custom
> > >>> >>> SQL
> > >>> >>> > > > > > operators
> > >>> >>> > > > > > >>> defined by ISO/IEC 19075-7:2021 (Part 7:
> Polymorphic
> > >>> table
> > >>> >>> > > > > functions).
> > >>> >>> > > > > > >>> ~200 pages define how this new kind of function can
> > >>> >>> consume and
> > >>> >>> > > > > produce
> > >>> >>> > > > > > >>> tables with various execution properties.
> > >>> >>> > > > > > >>> Unfortunately, this part of the standard is not
> > >>> publicly
> > >>> >>> > > > available."
> > >>> >>> > > > > > >>>
> > >>> >>> > > > > > >>> Of course we can take a look at some examples but
> do
> > we
> > >>> >>> really
> > >>> >>> > > want
> > >>> >>> > > > > to
> > >>> >>> > > > > > >>> expose state data with this construct
> > >>> >>> > > > > > >>> which is described in ~200 pages and part of the
> > >>> standard
> > >>> >>> is
> > >>> >>> > not
> > >>> >>> > > > > > publicly
> > >>> >>> > > > > > >>> available? 🙂
> > >>> >>> > > > > > >>> I mean the dataset is couple of rows and the
> use-case
> > >>> is
> > >>> >>> join
> > >>> >>> > > with
> > >>> >>> > > > > > >> another
> > >>> >>> > > > > > >>> table like with state data.
> > >>> >>> > > > > > >>> If somebody can give advantages I would buy that
> but
> > >>> from
> > >>> >>> my
> > >>> >>> > > > limited
> > >>> >>> > > > > > >>> understanding this would be an overkill here.
> > >>> >>> > > > > > >>>
> > >>> >>> > > > > > >>> BR,
> > >>> >>> > > > > > >>> G
> > >>> >>> > > > > > >>>
> > >>> >>> > > > > > >>>
> > >>> >>> > > > > > >>> On Wed, Mar 26, 2025 at 8:28 AM Gyula Fóra <
> > >>> >>> > gyula.f...@gmail.com
> > >>> >>> > > >
> > >>> >>> > > > > > wrote:
> > >>> >>> > > > > > >>>
> > >>> >>> > > > > > >>>> Hi Zakelly , Shengkai!
> > >>> >>> > > > > > >>>>
> > >>> >>> > > > > > >>>> I don't know too much about PTFs, it would be
> > >>> interesting
> > >>> >>> to
> > >>> >>> > see
> > >>> >>> > > > how
> > >>> >>> > > > > > the
> > >>> >>> > > > > > >>>> usage would look in practice.
> > >>> >>> > > > > > >>>>
> > >>> >>> > > > > > >>>> Do you have some mockup/example in mind how the
> PTF
> > >>> would
> > >>> >>> look
> > >>> >>> > > for
> > >>> >>> > > > > > >> example
> > >>> >>> > > > > > >>>> when want to:
> > >>> >>> > > > > > >>>> - Simply display/aggregate whats in the metadata
> > >>> >>> > > > > > >>>> - Join keyed state with some metadata columns
> > >>> >>> > > > > > >>>>
> > >>> >>> > > > > > >>>> Thanks
> > >>> >>> > > > > > >>>> Gyula
> > >>> >>> > > > > > >>>>
> > >>> >>> > > > > > >>>> On Wed, Mar 26, 2025 at 7:33 AM Zakelly Lan <
> > >>> >>> > > > zakelly....@gmail.com>
> > >>> >>> > > > > > >>>> wrote:
> > >>> >>> > > > > > >>>>
> > >>> >>> > > > > > >>>>> Hi everyone,
> > >>> >>> > > > > > >>>>>
> > >>> >>> > > > > > >>>>> I'm fine with a separate SQL connector for
> > metadata,
> > >>> so
> > >>> >>> maybe
> > >>> >>> > > we
> > >>> >>> > > > > > could
> > >>> >>> > > > > > >>>>> update the FLIP about our discussion? And
> Shengkai
> > >>> >>> provides a
> > >>> >>> > > PTF
> > >>> >>> > > > > > >>>>> implementation, does that also meet the
> > requirement?
> > >>> >>> > > > > > >>>>>
> > >>> >>> > > > > > >>>>>
> > >>> >>> > > > > > >>>>> Best,
> > >>> >>> > > > > > >>>>> Zakelly
> > >>> >>> > > > > > >>>>>
> > >>> >>> > > > > > >>>>> On Thu, Mar 20, 2025 at 4:47 PM Gabor Somogyi <
> > >>> >>> > > > > > >>>> gabor.g.somo...@gmail.com>
> > >>> >>> > > > > > >>>>> wrote:
> > >>> >>> > > > > > >>>>>
> > >>> >>> > > > > > >>>>>> Hi All,
> > >>> >>> > > > > > >>>>>>
> > >>> >>> > > > > > >>>>>> @Zakelly: Gyula summarised it correctly what I
> > >>> meant so
> > >>> >>> > please
> > >>> >>> > > > > treat
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>> content as mine.
> > >>> >>> > > > > > >>>>>> In addition, I'm not against adding a CLI at
> all,
> > >>> I'm
> > >>> >>> just
> > >>> >>> > > > stating
> > >>> >>> > > > > > >>>> that
> > >>> >>> > > > > > >>>>> in
> > >>> >>> > > > > > >>>>>> some cases like this, users would like to have
> > >>> >>> > > > > > >>>>>> a self-serving solution where they can provide
> SQL
> > >>> >>> > statements
> > >>> >>> > > > > which
> > >>> >>> > > > > > >>>> can
> > >>> >>> > > > > > >>>>>> trigger alerts automatically.
> > >>> >>> > > > > > >>>>>>
> > >>> >>> > > > > > >>>>>> My personal opinion is that CLI would be
> > beneficial
> > >>> for
> > >>> >>> > > several
> > >>> >>> > > > > > >>>> cases. A
> > >>> >>> > > > > > >>>>>> good example is when users want to restart job
> > >>> >>> > > > > > >>>>>> from specific Kafka offsets which are persisted
> > in a
> > >>> >>> > > savepoint.
> > >>> >>> > > > > For
> > >>> >>> > > > > > >>>> such
> > >>> >>> > > > > > >>>>>> scenario users are more than happy since they
> > >>> >>> > > > > > >>>>>> expect manual intervention with full control. So
> > >>> all in
> > >>> >>> all
> > >>> >>> > > one
> > >>> >>> > > > > can
> > >>> >>> > > > > > >>>> count
> > >>> >>> > > > > > >>>>>> on my +1 when CLI FLIP would come up...
> > >>> >>> > > > > > >>>>>>
> > >>> >>> > > > > > >>>>>> BR,
> > >>> >>> > > > > > >>>>>> G
> > >>> >>> > > > > > >>>>>>
> > >>> >>> > > > > > >>>>>>
> > >>> >>> > > > > > >>>>>> On Thu, Mar 20, 2025 at 8:20 AM Gyula Fóra <
> > >>> >>> > > > gyula.f...@gmail.com>
> > >>> >>> > > > > > >>>> wrote:
> > >>> >>> > > > > > >>>>>>
> > >>> >>> > > > > > >>>>>>> Hi!
> > >>> >>> > > > > > >>>>>>>
> > >>> >>> > > > > > >>>>>>> @Zakelly Lan <zakelly....@gmail.com>
> > >>> >>> > > > > > >>>>>>> I think what Gabor means is that users want to
> > have
> > >>> >>> > > predefined
> > >>> >>> > > > > SQL
> > >>> >>> > > > > > >>>>> scripts
> > >>> >>> > > > > > >>>>>>> to perform state analysis tasks to
> debug/identify
> > >>> >>> problems.
> > >>> >>> > > > > > >>>>>>> Such as write a SQL script that joins the
> > metadata
> > >>> >>> table
> > >>> >>> > with
> > >>> >>> > > > the
> > >>> >>> > > > > > >>>> state
> > >>> >>> > > > > > >>>>>>> and
> > >>> >>> > > > > > >>>>>>> do some analytics on it.
> > >>> >>> > > > > > >>>>>>>
> > >>> >>> > > > > > >>>>>>> If we have a meta table then the SQL script
> that
> > >>> can do
> > >>> >>> > this
> > >>> >>> > > is
> > >>> >>> > > > > > >> fixed
> > >>> >>> > > > > > >>>>> and
> > >>> >>> > > > > > >>>>>>> users can trigger this on demand by simply
> > >>> providing a
> > >>> >>> new
> > >>> >>> > > > > > >> savepoint
> > >>> >>> > > > > > >>>>> path.
> > >>> >>> > > > > > >>>>>>>
> > >>> >>> > > > > > >>>>>>> If we have a different mechanism to extract
> > >>> metadata
> > >>> >>> that
> > >>> >>> > is
> > >>> >>> > > > not
> > >>> >>> > > > > > >> SQL
> > >>> >>> > > > > > >>>>>>> native
> > >>> >>> > > > > > >>>>>>> then manual steps need to be executed and a
> > custom
> > >>> SQL
> > >>> >>> > script
> > >>> >>> > > > > would
> > >>> >>> > > > > > >>>> need
> > >>> >>> > > > > > >>>>>>> to
> > >>> >>> > > > > > >>>>>>> be written that adds the manually extracted
> > >>> metadata
> > >>> >>> into
> > >>> >>> > the
> > >>> >>> > > > > > >> script.
> > >>> >>> > > > > > >>>>>>>
> > >>> >>> > > > > > >>>>>>> Cheers,
> > >>> >>> > > > > > >>>>>>> Gyula
> > >>> >>> > > > > > >>>>>>>
> > >>> >>> > > > > > >>>>>>> On Thu, Mar 20, 2025 at 4:32 AM Zakelly Lan <
> > >>> >>> > > > > zakelly....@gmail.com
> > >>> >>> > > > > > >>>
> > >>> >>> > > > > > >>>>>>> wrote:
> > >>> >>> > > > > > >>>>>>>
> > >>> >>> > > > > > >>>>>>>> Hi all,
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>> Thanks for your answers! Getting everyone
> > aligned
> > >>> on
> > >>> >>> this
> > >>> >>> > > > topic
> > >>> >>> > > > > > >> is
> > >>> >>> > > > > > >>>>>>>> challenging, but it’s definitely worth the
> > effort
> > >>> >>> since it
> > >>> >>> > > > will
> > >>> >>> > > > > > >>>> help
> > >>> >>> > > > > > >>>>>>>> streamline things moving forward.
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>> @Gabor are you saying that users are using
> some
> > >>> >>> scripts to
> > >>> >>> > > > > define
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>> SQL
> > >>> >>> > > > > > >>>>>>>> metadata connector and get the information,
> > >>> right? If
> > >>> >>> so,
> > >>> >>> > > > would
> > >>> >>> > > > > a
> > >>> >>> > > > > > >>>> CLI
> > >>> >>> > > > > > >>>>>>> tool
> > >>> >>> > > > > > >>>>>>>> be more convenient? It's easy to invoke and
> can
> > >>> get
> > >>> >>> the
> > >>> >>> > > result
> > >>> >>> > > > > > >>>>> swiftly.
> > >>> >>> > > > > > >>>>>>> And
> > >>> >>> > > > > > >>>>>>>> there should be some other systems to track
> the
> > >>> >>> checkpoint
> > >>> >>> > > > > > >> lineage
> > >>> >>> > > > > > >>>> and
> > >>> >>> > > > > > >>>>>>>> analyze if there are outliers in metadata
> (e.g.
> > >>> state
> > >>> >>> size
> > >>> >>> > > of
> > >>> >>> > > > > one
> > >>> >>> > > > > > >>>>>>> operator)
> > >>> >>> > > > > > >>>>>>>> right? Well, maybe I missed something so
> please
> > >>> >>> correct me
> > >>> >>> > > if
> > >>> >>> > > > > I'm
> > >>> >>> > > > > > >>>>> wrong.
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>> I think the overall vision in Flink SQL is to
> > >>> provide
> > >>> >>> a
> > >>> >>> > SQL
> > >>> >>> > > > > > >> native
> > >>> >>> > > > > > >>>>>>>>> environment where we can serve complex
> > use-cases
> > >>> >>> like you
> > >>> >>> > > > would
> > >>> >>> > > > > > >>>>> expect
> > >>> >>> > > > > > >>>>>>>> in a
> > >>> >>> > > > > > >>>>>>>>> regular database.
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>> @Gyula Well, this is a good point. From the
> > >>> >>> perspective of
> > >>> >>> > > > > > >>>>> comprehensive
> > >>> >>> > > > > > >>>>>>>> SQL experience, I'd +1 for treating metadata
> as
> > >>> data.
> > >>> >>> > > > Although I
> > >>> >>> > > > > > >>>> doubt
> > >>> >>> > > > > > >>>>>>> if
> > >>> >>> > > > > > >>>>>>>> there is a need for processing metadata, I
> won't
> > >>> be
> > >>> >>> > against
> > >>> >>> > > a
> > >>> >>> > > > > > >>>> separate
> > >>> >>> > > > > > >>>>>>>> connector.
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>> Regarding the CLI tool, I still think it’s
> worth
> > >>> >>> > > implementing.
> > >>> >>> > > > > > >>>> Such a
> > >>> >>> > > > > > >>>>>>> tool
> > >>> >>> > > > > > >>>>>>>> could provide savepoint information before
> > >>> resuming
> > >>> >>> from a
> > >>> >>> > > > > > >>>> savepoint,
> > >>> >>> > > > > > >>>>>>> which
> > >>> >>> > > > > > >>>>>>>> would enhance the user experience in CLI-based
> > >>> >>> workflows.
> > >>> >>> > It
> > >>> >>> > > > > > >> would
> > >>> >>> > > > > > >>>> be
> > >>> >>> > > > > > >>>>>>> good
> > >>> >>> > > > > > >>>>>>>> if someone could implement this feature. We
> > >>> shouldn’t
> > >>> >>> > worry
> > >>> >>> > > > > about
> > >>> >>> > > > > > >>>>>>> whether
> > >>> >>> > > > > > >>>>>>>> this tool might be retired in the future.
> > >>> Regardless
> > >>> >>> of
> > >>> >>> > the
> > >>> >>> > > > > > >>>> SQL-based
> > >>> >>> > > > > > >>>>>>>> solution we eventually adopt, this capability
> > will
> > >>> >>> remain
> > >>> >>> > > > > > >> essential
> > >>> >>> > > > > > >>>>> for
> > >>> >>> > > > > > >>>>>>> CLI
> > >>> >>> > > > > > >>>>>>>> users. This is another topic.
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>> Zakelly
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>> On Thu, Mar 20, 2025 at 10:37 AM Shengkai
> Fang <
> > >>> >>> > > > > > >> fskm...@gmail.com>
> > >>> >>> > > > > > >>>>>>> wrote:
> > >>> >>> > > > > > >>>>>>>>
> > >>> >>> > > > > > >>>>>>>>> Hi.
> > >>> >>> > > > > > >>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>> After reading the doc[1], I think Spark
> > provides
> > >>> a
> > >>> >>> > function
> > >>> >>> > > > for
> > >>> >>> > > > > > >>>>> users
> > >>> >>> > > > > > >>>>>>> to
> > >>> >>> > > > > > >>>>>>>>> consume the metadata from the savepoint.  In
> > >>> Flink
> > >>> >>> SQL,
> > >>> >>> > > > similar
> > >>> >>> > > > > > >>>>>>>>> functionality is implemented through
> > Polymorphic
> > >>> >>> Table
> > >>> >>> > > > > > >> Functions
> > >>> >>> > > > > > >>>>>>> (PTF) as
> > >>> >>> > > > > > >>>>>>>>> proposed in FLIP-440[2]. Below is a code
> > >>> example[3]
> > >>> >>> > > > > > >> illustrating
> > >>> >>> > > > > > >>>>> this
> > >>> >>> > > > > > >>>>>>>>> concept:
> > >>> >>> > > > > > >>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>> ```
> > >>> >>> > > > > > >>>>>>>>>    public static class ScalarArgsFunction extends TestProcessTableFunctionBase {
> > >>> >>> > > > > > >>>>>>>>>        public void eval(Integer i, Boolean b) {
> > >>> >>> > > > > > >>>>>>>>>            collectObjects(i, b);
> > >>> >>> > > > > > >>>>>>>>>        }
> > >>> >>> > > > > > >>>>>>>>>    }
> > >>> >>> > > > > > >>>>>>>>> ```
> > >>> >>> > > > > > >>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>> ```
> > >>> >>> > > > > > >>>>>>>>> INSERT INTO sink SELECT * FROM f(i => 42, b => CAST('TRUE' AS BOOLEAN))
> > >>> >>> > > > > > >>>>>>>>> ```
> > >>> >>> > > > > > >>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>> So we can add a builtin function named
> > >>> >>> > > `read_state_metadata`
> > >>> >>> > > > to
> > >>> >>> > > > > > >>>> read
> > >>> >>> > > > > > >>>>>>>>> savepoint data.
> > >>> >>> > > > > > >>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>> Shengkai
> > >>> >>> > > > > > >>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>> [1] https://docs.databricks.com/aws/en/structured-streaming/read-state?language=SQL
> > >>> >>> > > > > > >>>>>>>>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093
> > >>> >>> > > > > > >>>>>>>>> [3] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/stream/ProcessTableFunctionTestPrograms.java#L140
> > >>> >>> > > > > > >>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>> On Wed, Mar 19, 2025 at 6:37 PM Gyula Fóra <gyula.f...@gmail.com> wrote:
> > >>> >>> > > > > > >>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> Hi All!
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> Thank you for the answers and concerns from
> > >>> >>> everyone.
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> On the CLI vs State Metadata Connector/Table
> > >>> >>> question I
> > >>> >>> > > > would
> > >>> >>> > > > > > >>>> also
> > >>> >>> > > > > > >>>>>>> like
> > >>> >>> > > > > > >>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>> step back a little and look at the bigger
> > >>> picture.
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> I think the overall vision in Flink SQL is
> to
> > >>> >>> provide a
> > >>> >>> > > SQL
> > >>> >>> > > > > > >>>> native
> > >>> >>> > > > > > >>>>>>>>>> environment where we can serve complex
> > use-cases
> > >>> >>> like
> > >>> >>> > you
> > >>> >>> > > > > > >> would
> > >>> >>> > > > > > >>>>>>> expect
> > >>> >>> > > > > > >>>>>>>>> in a
> > >>> >>> > > > > > >>>>>>>>>> regular database.
> > >>> >>> > > > > > >>>>>>>>>> Most features, developments in the recent
> > years
> > >>> have
> > >>> >>> > gone
> > >>> >>> > > > > > >> this
> > >>> >>> > > > > > >>>>> way.
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> The State Metadata Table would be a natural
> > and
> > >>> >>> > > > > > >> straightforward
> > >>> >>> > > > > > >>>>> fit
> > >>> >>> > > > > > >>>>>>>> here.
> > >>> >>> > > > > > >>>>>>>>>> So from my side, +1 for that.
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> However I could understand if we are not
> ready
> > >>> to
> > >>> >>> add a
> > >>> >>> > > new
> > >>> >>> > > > > > >>>>>>>>>> connector/format due to maintenance concerns
> > >>> (and in
> > >>> >>> > > general
> > >>> >>> > > > > > >>>>> concern
> > >>> >>> > > > > > >>>>>>>>> about
> > >>> >>> > > > > > >>>>>>>>>> the design).
> > >>> >>> > > > > > >>>>>>>>>> If that's the issue then we should spend
> more
> > >>> time
> > >>> >>> on
> > >>> >>> > the
> > >>> >>> > > > > > >>>> design
> > >>> >>> > > > > > >>>>> to
> > >>> >>> > > > > > >>>>>>> get
> > >>> >>> > > > > > >>>>>>>>>> comfortable with the approach and seek
> > feedback
> > >>> >>> from the
> > >>> >>> > > > > > >> wider
> > >>> >>> > > > > > >>>>>>>> community
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> I am -1 for the CLI/tooling approach as that
> > >>> will
> > >>> >>> not
> > >>> >>> > > > provide
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>>>> featureset we are looking for that is not
> > >>> already
> > >>> >>> > covered
> > >>> >>> > > by
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>> Java
> > >>> >>> > > > > > >>>>>>>>>> connector. And that approach would come with
> > the
> > >>> >>> same
> > >>> >>> > > > > > >>>> maintenance
> > >>> >>> > > > > > >>>>>>>>>> implications.
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> Cheers
> > >>> >>> > > > > > >>>>>>>>>> Gyula
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>> On Wed, Mar 19, 2025 at 11:24 AM Gabor
> > Somogyi <
> > >>> >>> > > > > > >>>>>>>>> gabor.g.somo...@gmail.com>
> > >>> >>> > > > > > >>>>>>>>>> wrote:
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>> Hi Zakelly, Shengkai
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>> Several topics are going on so adding gist
> > >>> answers
> > >>> >>> to
> > >>> >>> > > them.
> > >>> >>> > > > > > >>>> When
> > >>> >>> > > > > > >>>>>>> some
> > >>> >>> > > > > > >>>>>>>>>> topic
> > >>> >>> > > > > > >>>>>>>>>>> is not touched please highlight it.
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>> @Shengkai: I've read through all the
> previous
> > >>> FLIPs
> > >>> >>> > > related
> > >>> >>> > > > > > >>>>>>> catalogs
> > >>> >>> > > > > > >>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>> if
> > >>> >>> > > > > > >>>>>>>>>>> we would like to keep the concepts there
> > >>> >>> > > > > > >>>>>>>>>>> then one-to-one mapping relationship
> between
> > >>> >>> savepoint
> > >>> >>> > > and
> > >>> >>> > > > > > >>>>> catalog
> > >>> >>> > > > > > >>>>>>>> is a
> > >>> >>> > > > > > >>>>>>>>>>> reasonable direction. In short I'm happy
> that
> > >>> >>> > > > > > >>>>>>>>>>> you've highlighted this and agree as a
> whole.
> > >>> I've
> > >>> >>> > > written
> > >>> >>> > > > > > >> it
> > >>> >>> > > > > > >>>>> down
> > >>> >>> > > > > > >>>>>>>>>>> previously, just want to double confirm
> that
> > >>> state
> > >>> >>> > > catalog
> > >>> >>> > > > > > >> is
> > >>> >>> > > > > > >>>>>>>>>>> essential and planned. When we reach this
> > point
> > >>> >>> then
> > >>> >>> > your
> > >>> >>> > > > > > >>>> input
> > >>> >>> > > > > > >>>>> is
> > >>> >>> > > > > > >>>>>>>> more
> > >>> >>> > > > > > >>>>>>>>>>> than welcome.
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>> @Zakelly: We've tried the CLI and separate
> > >>> library
> > >>> >>> > > > > > >> approaches
> > >>> >>> > > > > > >>>>> with
> > >>> >>> > > > > > >>>>>>>>> users
> > >>> >>> > > > > > >>>>>>>>>>> already and these are not something which
> is
> > >>> >>> welcome
> > >>> >>> > > > > > >> because
> > >>> >>> > > > > > >>>> of
> > >>> >>> > > > > > >>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>> following:
> > >>> >>> > > > > > >>>>>>>>>>> * Users want to have automated tasks and
> not
> > >>> manual
> > >>> >>> > > > > > >>>> CLI/library
> > >>> >>> > > > > > >>>>>>>> output
> > >>> >>> > > > > > >>>>>>>>>>> parsing. This can be hacked around but our
> > >>> >>> experience
> > >>> >>> > is
> > >>> >>> > > > > > >>>>> negative
> > >>> >>> > > > > > >>>>>>> on
> > >>> >>> > > > > > >>>>>>>>> this
> > >>> >>> > > > > > >>>>>>>>>>> because it's just brittle.
> > >>> >>> > > > > > >>>>>>>>>>> * From a development perspective it's a
> much
> > >>> bigger
> > >>> >>> > > effort
> > >>> >>> > > > > > >>>> than
> > >>> >>> > > > > > >>>>> a
> > >>> >>> > > > > > >>>>>>>>>> connector
> > >>> >>> > > > > > >>>>>>>>>>> (hard to test, packaging/version handling
> is
> > >>> an
> > >>> >>> extra
> > >>> >>> > > > > > >> layer
> > >>> >>> > > > > > >>>> of
> > >>> >>> > > > > > >>>>>>>>>> complexity,
> > >>> >>> > > > > > >>>>>>>>>>> external FS authentication is pain for
> users,
> > >>> >>> expecting
> > >>> >>> > > > > > >> them
> > >>> >>> > > > > > >>>> to
> > >>> >>> > > > > > >>>>>>>>> download
> > >>> >>> > > > > > >>>>>>>>>>> savepoints also)
> > >>> >>> > > > > > >>>>>>>>>>> * Purely personal opinion but if we would
> > find
> > >>> >>> better
> > >>> >>> > > ways
> > >>> >>> > > > > > >>>> later
> > >>> >>> > > > > > >>>>>>> then
> > >>> >>> > > > > > >>>>>>>>>>> retiring a CLI is no more lightweight than
> > >>> retiring a
> > >>> >>> > > > > > >> connector
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> It would be great if you give some
> examples
> > >>> on how
> > >>> >>> > user
> > >>> >>> > > > > > >>>> could
> > >>> >>> > > > > > >>>>>>>>> leverage
> > >>> >>> > > > > > >>>>>>>>>>> the separate connector to process the
> > metadata.
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>> The simplest cases:
> > >>> >>> > > > > > >>>>>>>>>>> * give me the overgrowing state uids
> > >>> >>> > > > > > >>>>>>>>>>> * give me the unknown (new or renamed)
> > state
> > >>> uids
> > >>> >>> > > > > > >>>>>>>>>>> * give me the state uids where state size
> > >>> >>> drastically
> > >>> >>> > > > > > >> dropped
> > >>> >>> > > > > > >>>>>>> compared
> > >>> >>> > > > > > >>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>> a
> > >>> >>> > > > > > >>>>>>>>>>> previous savepoint (accidental state loss)
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>> Since it was mentioned: as a general
> offtopic
> > >>> >>> teaser,
> > >>> >>> > > yeah
> > >>> >>> > > > > > >> it
> > >>> >>> > > > > > >>>>>>> would
> > >>> >>> > > > > > >>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>> good
> > >>> >>> > > > > > >>>>>>>>>>> to have some sort of checkpoint/savepoint
> > >>> lineage
> > >>> >>> or
> > >>> >>> > > > > > >> however
> > >>> >>> > > > > > >>>> we
> > >>> >>> > > > > > >>>>>>> call
> > >>> >>> > > > > > >>>>>>>>> it.
> > >>> >>> > > > > > >>>>>>>>>>> Since we've not yet reached this point
> there
> > >>> are no
> > >>> >>> > > > > > >> technical
> > >>> >>> > > > > > >>>>>>>> details,
> > >>> >>> > > > > > >>>>>>>>>> it's
> > >>> >>> > > > > > >>>>>>>>>>> more like a vision. It's a common pattern
> > that
> > >>> >>> > > > > > >>>>>>>>>>> jobs are physically running but somehow the
> > >>> state
> > >>> >>> > > > > > >> processing
> > >>> >>> > > > > > >>>> is
> > >>> >>> > > > > > >>>>>>> stuck
> > >>> >>> > > > > > >>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>> it would be good to add some way to find it
> > out
> > >>> >>> > > > > > >>>> automatically.
> > >>> >>> > > > > > >>>>>>>>>>> The key point here is automation and
> > not
> > >>> >>> manual
> > >>> >>> > > > > > >>>>> evaluation
> > >>> >>> > > > > > >>>>>>>> since
> > >>> >>> > > > > > >>>>>>>>>>> handling 10k+ jobs simply does not allow
> that.
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>> BR,
> > >>> >>> > > > > > >>>>>>>>>>> G
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>> On Wed, Mar 19, 2025 at 6:46 AM Shengkai
> > Fang <
> > >>> >>> > > > > > >>>>> fskm...@gmail.com>
> > >>> >>> > > > > > >>>>>>>>> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> Hi, All.
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> About State Catalog, I want to share more
> > >>> thoughts
> > >>> >>> > about
> > >>> >>> > > > > > >>>> this.
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> In the initial design concept, I
> understood
> > >>> that a
> > >>> >>> > > > > > >>>> savepoint
> > >>> >>> > > > > > >>>>>>> and a
> > >>> >>> > > > > > >>>>>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>> catalog have a one-to-one mapping
> > >>> relationship.
> > >>> >>> Each
> > >>> >>> > > > > > >>>> operator
> > >>> >>> > > > > > >>>>>>>>>> corresponds
> > >>> >>> > > > > > >>>>>>>>>>>> to a database, and the state of each
> > operator
> > >>> is
> > >>> >>> > > > > > >>>> represented
> > >>> >>> > > > > > >>>>> as
> > >>> >>> > > > > > >>>>>>>>>>> individual
> > >>> >>> > > > > > >>>>>>>>>>>> tables. The rationale behind this design
> is:
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> *State Diversity*: An operator may involve
> > >>> >>> multiple
> > >>> >>> > > types
> > >>> >>> > > > > > >>>> of
> > >>> >>> > > > > > >>>>>>>> states.
> > >>> >>> > > > > > >>>>>>>>>> For
> > >>> >>> > > > > > >>>>>>>>>>>> example, in our VVR design, a "multi-join"
> > >>> >>> operator
> > >>> >>> > uses
> > >>> >>> > > > > > >>>> keyed
> > >>> >>> > > > > > >>>>>>>> states
> > >>> >>> > > > > > >>>>>>>>>> for
> > >>> >>> > > > > > >>>>>>>>>>>> two input streams and a broadcast state
> for
> > >>> the
> > >>> >>> third
> > >>> >>> > > > > > >>>> stream.
> > >>> >>> > > > > > >>>>>>> This
> > >>> >>> > > > > > >>>>>>>>>> makes
> > >>> >>> > > > > > >>>>>>>>>>> it
> > >>> >>> > > > > > >>>>>>>>>>>> challenging to represent all states of an
> > >>> operator
> > >>> >>> > > > > > >> within a
> > >>> >>> > > > > > >>>>>>> single
> > >>> >>> > > > > > >>>>>>>>>> table.
> > >>> >>> > > > > > >>>>>>>>>>>> *Scalability*: Internally, an operator
> might
> > >>> have
> > >>> >>> > > > > > >> multiple
> > >>> >>> > > > > > >>>>> keyed
> > >>> >>> > > > > > >>>>>>>>> states
> > >>> >>> > > > > > >>>>>>>>>>>> (e.g., value state and list state).
> However,
> > >>> large
> > >>> >>> > list
> > >>> >>> > > > > > >>>> states
> > >>> >>> > > > > > >>>>>>> may
> > >>> >>> > > > > > >>>>>>>>> not
> > >>> >>> > > > > > >>>>>>>>>>> fit
> > >>> >>> > > > > > >>>>>>>>>>>> entirely in memory. To address this, we
> > >>> recommend
> > >>> >>> > > > > > >>>> implementing
> > >>> >>> > > > > > >>>>>>> each
> > >>> >>> > > > > > >>>>>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>> as a separate table.
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> To resolve the loosely coupled
> relationships
> > >>> >>> between
> > >>> >>> > > > > > >>>> operator
> > >>> >>> > > > > > >>>>>>>> states,
> > >>> >>> > > > > > >>>>>>>>>> we
> > >>> >>> > > > > > >>>>>>>>>>>> propose embedding predefined views within
> > the
> > >>> >>> catalog.
> > >>> >>> > > > > > >>>> These
> > >>> >>> > > > > > >>>>>>> views
> > >>> >>> > > > > > >>>>>>>>>>> simplify
> > >>> >>> > > > > > >>>>>>>>>>>> user understanding of operator
> > >>> implementations and
> > >>> >>> > > > > > >> provide
> > >>> >>> > > > > > >>>> a
> > >>> >>> > > > > > >>>>>>> more
> > >>> >>> > > > > > >>>>>>>>>>> intuitive
> > >>> >>> > > > > > >>>>>>>>>>>> perspective. For instance, a join operator
> > may
> > >>> >>> have
> > >>> >>> > > > > > >>>> multiple
> > >>> >>> > > > > > >>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>> implementations (depending on whether the
> > >>> join key
> > >>> >>> > > > > > >> includes
> > >>> >>> > > > > > >>>>>>> unique
> > >>> >>> > > > > > >>>>>>>>>>>> attributes), but users primarily care
> about
> > >>> the
> > >>> >>> data
> > >>> >>> > > > > > >>>>> associated
> > >>> >>> > > > > > >>>>>>>> with
> > >>> >>> > > > > > >>>>>>>>> a
> > >>> >>> > > > > > >>>>>>>>>>>> specific join key across input streams.
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> Returning to the one-to-one mapping
> between
> > >>> >>> savepoints
> > >>> >>> > > > > > >> and
> > >>> >>> > > > > > >>>>>>>> catalogs,
> > >>> >>> > > > > > >>>>>>>>> we
> > >>> >>> > > > > > >>>>>>>>>>> aim
> > >>> >>> > > > > > >>>>>>>>>>>> to manage multiple user state catalogs
> > >>> through a
> > >>> >>> > catalog
> > >>> >>> > > > > > >>>>> store.
> > >>> >>> > > > > > >>>>>>>> When
> > >>> >>> > > > > > >>>>>>>>> a
> > >>> >>> > > > > > >>>>>>>>>>> user
> > >>> >>> > > > > > >>>>>>>>>>>> triggers a savepoint for a job on the
> > >>> platform:
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> 1. The platform sends a REST request to
> the
> > >>> >>> > JobManager.
> > >>> >>> > > > > > >>>>>>>>>>>> 2. Simultaneously, it registers a new
> state
> > >>> >>> catalog in
> > >>> >>> > > > > > >> the
> > >>> >>> > > > > > >>>>>>> catalog
> > >>> >>> > > > > > >>>>>>>>>> store,
> > >>> >>> > > > > > >>>>>>>>>>>> enabling immediate analysis of state data
> on
> > >>> the
> > >>> >>> > > > > > >> platform.
> > >>> >>> > > > > > >>>>>>>>>>>> 3. Deleting a savepoint would also trigger
> > the
> > >>> >>> removal
> > >>> >>> > > of
> > >>> >>> > > > > > >>>> its
> > >>> >>> > > > > > >>>>>>>>>> associated
> > >>> >>> > > > > > >>>>>>>>>>>> catalog.
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> This vision assumes that states are
> > >>> >>> self-describing or
> > >>> >>> > > > > > >>>> that a
> > >>> >>> > > > > > >>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>> metaservice is introduced to analyze
> > savepoint
> > >>> >>> > > > > > >> structures.
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> How can users create logic to identify
> > >>> >>> differences
> > >>> >>> > > > > > >>>> between
> > >>> >>> > > > > > >>>>>>>> multiple
> > >>> >>> > > > > > >>>>>>>>>>>> savepoints?
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> Since savepoints and state catalogs are
> > >>> one-to-one
> > >>> >>> > > > > > >> mapped,
> > >>> >>> > > > > > >>>>> users
> > >>> >>> > > > > > >>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>> query
> > >>> >>> > > > > > >>>>>>>>>>>> metadata via their respective catalogs.
> For
> > >>> >>> example:
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> 1.
> > >>> >>> > > > > > >>>>>
> > >>> >>> `savepoint-${id}`.`system`.`metadata_table`.`<operator-name>`
> > >>> >>> > > > > > >>>>>>>>>> provides
> > >>> >>> > > > > > >>>>>>>>>>>> operator-specific metadata (e.g., state
> > size,
> > >>> >>> type).
> > >>> >>> > > > > > >>>>>>>>>>>> 2. Comparing metadata tables (e.g., schema
> > >>> >>> versions,
> > >>> >>> > > > > > >> state
> > >>> >>> > > > > > >>>>> entry
> > >>> >>> > > > > > >>>>>>>>>> counts)
> > >>> >>> > > > > > >>>>>>>>>>>> across catalogs reveals structural or
> > >>> quantitative
> > >>> >>> > > > > > >>>>> differences.
> > >>> >>> > > > > > >>>>>>>>>>>> 3. For deeper analysis, users could write
> > SQL
> > >>> >>> queries
> > >>> >>> > to
> > >>> >>> > > > > > >>>>> compare
> > >>> >>> > > > > > >>>>>>>>>> specific
> > >>> >>> > > > > > >>>>>>>>>>>> state partitions or leverage the
> metaservice
> > >>> to
> > >>> >>> track
> > >>> >>> > > > > > >> state
> > >>> >>> > > > > > >>>>>>>> evolution
> > >>> >>> > > > > > >>>>>>>>>>>> (e.g., added/removed operators, modified
> > state
> > >>> >>> > > > > > >>>>> configurations).
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> If we plan to introduce a state catalog in
> > the
> > >>> >>> > future, I
> > >>> >>> > > > > > >>>> would
> > >>> >>> > > > > > >>>>>>> lean
> > >>> >>> > > > > > >>>>>>>>>>> toward
> > >>> >>> > > > > > >>>>>>>>>>>> using metadata tables. If a utility tool
> can
> > >>> >>> address
> > >>> >>> > the
> > >>> >>> > > > > > >>>>>>> challenges
> > >>> >>> > > > > > >>>>>>>>> we
> > >>> >>> > > > > > >>>>>>>>>>>> face, could we avoid introducing an
> > additional
> > >>> >>> > > connector?
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>>>>> Shengkai
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> Gyula Fóra <gyula.f...@gmail.com>
> > >>> wrote on Mon, Mar 17, 2025
> > >>> >>> > > at 20:25:
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> Hi All!
> > >>> >>> > > > > > >>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> Without going into too much detail here
> are
> > >>> my 2
> > >>> >>> > cents
> > >>> >>> > > > > > >>>>>>> regarding
> > >>> >>> > > > > > >>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>> virtual column / catalog metadata / table
> > >>> >>> (connector)
> > >>> >>> > > > > > >>>>>>> discussion
> > >>> >>> > > > > > >>>>>>>>> for
> > >>> >>> > > > > > >>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>> State metadata.
> > >>> >>> > > > > > >>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> State metadata such as the types of
> states,
> > >>> their
> > >>> >>> > > > > > >>>>> properties,
> > >>> >>> > > > > > >>>>>>>>> names,
> > >>> >>> > > > > > >>>>>>>>>>>> sizes
> > >>> >>> > > > > > >>>>>>>>>>>>> etc are all valuable information that can
> > be
> > >>> >>> used to
> > >>> >>> > > > > > >>>> enrich
> > >>> >>> > > > > > >>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>> computations we do on state.
> > >>> >>> > > > > > >>>>>>>>>>>>> We can either analyze it standalone (such
> > as
> > >>> >>> discover
> > >>> >>> > > > > > >>>>>>> anomalies,
> > >>> >>> > > > > > >>>>>>>>> for
> > >>> >>> > > > > > >>>>>>>>>>>> large
> > >>> >>> > > > > > >>>>>>>>>>>>> jobs with many states), across multiple
> > >>> >>> savepoints
> > >>> >>> > > > > > >>>> (discover
> > >>> >>> > > > > > >>>>>>> how
> > >>> >>> > > > > > >>>>>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>>> changed over time) or by joining it with
> > >>> keyed or
> > >>> >>> > > > > > >>>> non-keyed
> > >>> >>> > > > > > >>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>> data
> > >>> >>> > > > > > >>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>> serve more complex queries on the state.
> > >>> >>> > > > > > >>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> The only solution that seems to serve all
> > >>> these
> > >>> >>> > > > > > >> use-cases
> > >>> >>> > > > > > >>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>> requirements
> > >>> >>> > > > > > >>>>>>>>>>>>> in a straightforward and SQL canonical
> way
> > >>> is to
> > >>> >>> > simply
> > >>> >>> > > > > > >>>>> expose
> > >>> >>> > > > > > >>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>>> metadata as a separate table. This is a
> > >>> metadata
> > >>> >>> > table
> > >>> >>> > > > > > >>>> but
> > >>> >>> > > > > > >>>>> you
> > >>> >>> > > > > > >>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>> also
> > >>> >>> > > > > > >>>>>>>>>>>>> think of it as a data table; it makes no
> > >>> practical
> > >>> >>> > > > > > >>>> difference
> > >>> >>> > > > > > >>>>>>> here.
> > >>> >>> > > > > > >>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> Once we have a catalog later, the catalog
> > can
> > >>> >>> offer
> > >>> >>> > > > > > >> this
> > >>> >>> > > > > > >>>>> table
> > >>> >>> > > > > > >>>>>>>> out
> > >>> >>> > > > > > >>>>>>>>> of
> > >>> >>> > > > > > >>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>> box, the same way databases provide
> > metadata
> > >>> >>> tables.
> > >>> >>> > > > > > >> For
> > >>> >>> > > > > > >>>>> this
> > >>> >>> > > > > > >>>>>>> to
> > >>> >>> > > > > > >>>>>>>>> work
> > >>> >>> > > > > > >>>>>>>>>>>>> however we need another, simpler
> connector
> > >>> that
> > >>> >>> > creates
> > >>> >>> > > > > > >>>> this
> > >>> >>> > > > > > >>>>>>>> table.
> > >>> >>> > > > > > >>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> +1 for state metadata as a separate
> > >>> >>> connector/table,
> > >>> >>> > > > > > >>>> instead
> > >>> >>> > > > > > >>>>>>> of
> > >>> >>> > > > > > >>>>>>>>>> adding
> > >>> >>> > > > > > >>>>>>>>>>>>> virtual columns and ad hoc catalog
> metadata
> > >>> that
> > >>> >>> is
> > >>> >>> > hard
> > >>> >>> > > > > > >>>> to
> > >>> >>> > > > > > >>>>> use
> > >>> >>> > > > > > >>>>>>>> in a
> > >>> >>> > > > > > >>>>>>>>>>> large
> > >>> >>> > > > > > >>>>>>>>>>>>> number of queries.
> > >>> >>> > > > > > >>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> Cheers,
> > >>> >>> > > > > > >>>>>>>>>>>>> Gyula
> > >>> >>> > > > > > >>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>> On Mon, Mar 17, 2025 at 12:44 PM Gabor
> > >>> Somogyi <
> > >>> >>> > > > > > >>>>>>>>>>>> gabor.g.somo...@gmail.com>
> > >>> >>> > > > > > >>>>>>>>>>>>> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> 1. State TTL for Value Columns
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>> I’m planning on adding this, and we may
> > >>> >>> collaborate
> > >>> >>> > > > > > >>>> on
> > >>> >>> > > > > > >>>>> it
> > >>> >>> > > > > > >>>>>>> in
> > >>> >>> > > > > > >>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>> future.
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> +1 on this, just ping me.
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> After some code digging and a POC, all I
> can
> > >>> say is
> > >>> >>> that
> > >>> >>> > > > > > >> with
> > >>> >>> > > > > > >>>>>>> heavy
> > >>> >>> > > > > > >>>>>>>>>> effort
> > >>> >>> > > > > > >>>>>>>>>>> we
> > >>> >>> > > > > > >>>>>>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>>>> maybe add such changes that we're able
> to
> > >>> show
> > >>> >>> > > > > > >> metadata
> > >>> >>> > > > > > >>>>> of a
> > >>> >>> > > > > > >>>>>>>>>>> savepoint
> > >>> >>> > > > > > >>>>>>>>>>>>> from
> > >>> >>> > > > > > >>>>>>>>>>>>>> catalog.
> > >>> >>> > > > > > >>>>>>>>>>>>>> I'm not against that but from user
> > >>> perspective
> > >>> >>> this
> > >>> >>> > > > > > >> has
> > >>> >>> > > > > > >>>>>>> limited
> > >>> >>> > > > > > >>>>>>>>>>> value,
> > >>> >>> > > > > > >>>>>>>>>>>>> let
> > >>> >>> > > > > > >>>>>>>>>>>>>> me explain why.
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> From a high-level perspective I see the
> > >>> following
> > >>> >>> > > > > > >> which I
> > >>> >>> > > > > > >>>>> see
> > >>> >>> > > > > > >>>>>>>>>> agreement
> > >>> >>> > > > > > >>>>>>>>>>>> on:
> > >>> >>> > > > > > >>>>>>>>>>>>>> * We should have a catalog which is
> > >>> >>> representing one
> > >>> >>> > > > > > >> or
> > >>> >>> > > > > > >>>>> more
> > >>> >>> > > > > > >>>>>>>> jobs
> > >>> >>> > > > > > >>>>>>>>>>>>> savepoint
> > >>> >>> > > > > > >>>>>>>>>>>>>> data set (future plan)
> > >>> >>> > > > > > >>>>>>>>>>>>>> * Savepoints should be able to be
> > >>> registered in
> > >>> >>> the
> > >>> >>> > > > > > >>>>> catalog
> > >>> >>> > > > > > >>>>>>>> which
> > >>> >>> > > > > > >>>>>>>>>> are
> > >>> >>> > > > > > >>>>>>>>>>>>> then
> > >>> >>> > > > > > >>>>>>>>>>>>>> databases (future plan)
> > >>> >>> > > > > > >>>>>>>>>>>>>> * There must be a possibility to create
> > >>> tables
> > >>> >>> from
> > >>> >>> > > > > > >>>>> databases
> > >>> >>> > > > > > >>>>>>>>> where
> > >>> >>> > > > > > >>>>>>>>>>>> users
> > >>> >>> > > > > > >>>>>>>>>>>>>> can read state data (exists already)
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> In terms of metadata, if I understand
> > >>> correctly
> > >>> >>> then
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>>> suggested
> > >>> >>> > > > > > >>>>>>>>>>>>> approach
> > >>> >>> > > > > > >>>>>>>>>>>>>> would be to access
> > >>> >>> > > > > > >>>>>>>>>>>>>> it from the catalog describe command,
> > right?
> > >>> >>> Adding
> > >>> >>> > > > > > >>>> that
> > >>> >>> > > > > > >>>>>>> info
> > >>> >>> > > > > > >>>>>>>>> when
> > >>> >>> > > > > > >>>>>>>>>>>>> specific
> > >>> >>> > > > > > >>>>>>>>>>>>>> database describe command
> > >>> >>> > > > > > >>>>>>>>>>>>>> is executed could be done.
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> The question is for instance how can
> users
> > >>> >>> create
> > >>> >>> > > > > > >> such
> > >>> >>> > > > > > >>>> a
> > >>> >>> > > > > > >>>>>>> logic
> > >>> >>> > > > > > >>>>>>>>> that
> > >>> >>> > > > > > >>>>>>>>>>>> tells
> > >>> >>> > > > > > >>>>>>>>>>>>>> them what is
> > >>> >>> > > > > > >>>>>>>>>>>>>> the difference between multiple
> > savepoints?
> > >>> >>> > > > > > >>>>>>>>>>>>>> Just to give some examples:
> > >>> >>> > > > > > >>>>>>>>>>>>>> * per operator size changes between
> > >>> savepoints
> > >>> >>> > > > > > >>>>>>>>>>>>>> * show values from operator data where
> > state
> > >>> >>> size
> > >>> >>> > > > > > >>>> reaches
> > >>> >>> > > > > > >>>>> a
> > >>> >>> > > > > > >>>>>>>>>> boundary
> > >>> >>> > > > > > >>>>>>>>>>>>>> * in general "find which checkpoint
> ruined
> > >>> >>> things"
> > >>> >>> > is
> > >>> >>> > > > > > >>>>> quite a
> > >>> >>> > > > > > >>>>>>>>> common
> > >>> >>> > > > > > >>>>>>>>>>>>> pattern
> > >>> >>> > > > > > >>>>>>>>>>>>>> What I would like to highlight here is
> > that
> > >>> from
> > >>> >>> > > > > > >> Flink
> > >>> >>> > > > > > >>>>>>> point of
> > >>> >>> > > > > > >>>>>>>>>> view
> > >>> >>> > > > > > >>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>> metadata can be
> > >>> >>> > > > > > >>>>>>>>>>>>>> considered as a static side output
> > >>> information
> > >>> >>> but
> > >>> >>> > > > > > >> for
> > >>> >>> > > > > > >>>>> users
> > >>> >>> > > > > > >>>>>>>>> these
> > >>> >>> > > > > > >>>>>>>>>>>> values
> > >>> >>> > > > > > >>>>>>>>>>>>>> are actual real data
> > >>> >>> > > > > > >>>>>>>>>>>>>> around which logic is planned to be built.
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>> The metadata is more like one-time
> > >>> information
> > >>> >>> > > > > > >>>> instead
> > >>> >>> > > > > > >>>>> of
> > >>> >>> > > > > > >>>>>>> a
> > >>> >>> > > > > > >>>>>>>>>>> streaming
> > >>> >>> > > > > > >>>>>>>>>>>>>> data that changes all
> > >>> >>> > > > > > >>>>>>>>>>>>>> the time, so a single connector seems to
> > be
> > >>> an
> > >>> >>> > > > > > >>>> overkill.
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> State data is also static within a
> > >>> savepoint and
> > >>> >>> > > > > > >> that's
> > >>> >>> > > > > > >>>>> the
> > >>> >>> > > > > > >>>>>>>>> reason
> > >>> >>> > > > > > >>>>>>>>>>> why
> > >>> >>> > > > > > >>>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>> state processor API is working in batch
> > >>> mode.
> > >>> >>> > > > > > >>>>>>>>>>>>>> When we handle multiple checkpoints in a
> > >>> >>> streaming
> > >>> >>> > > > > > >>>> fashion
> > >>> >>> > > > > > >>>>>>> then
> > >>> >>> > > > > > >>>>>>>>>> this
> > >>> >>> > > > > > >>>>>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>>>> viewed from another angle.
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> We can come up with a more lightweight
> > >>> solution
> > >>> >>> other
> > >>> >>> > > > > > >>>> than a
> > >>> >>> > > > > > >>>>>>> new
> > >>> >>> > > > > > >>>>>>>>>>>> connector
> > >>> >>> > > > > > >>>>>>>>>>>>>> but forcing users to parse the catalog
> > >>> >>> > > > > > >>>>>>>>>>>>>> describe command output in order to
> > compare
> > >>> >>> multiple
> > >>> >>> > > > > > >>>>>>> savepoints
> > >>> >>> > > > > > >>>>>>>>>>> doesn't
> > >>> >>> > > > > > >>>>>>>>>>>>>> sound like a smooth user experience.
> > >>> >>> > > > > > >>>>>>>>>>>>>> Honestly I've no other idea how to expose
> > >>> >>> metadata as
> > >>> >>> > > > > > >>>> real
> > >>> >>> > > > > > >>>>>>> user
> > >>> >>> > > > > > >>>>>>>>> data
> > >>> >>> > > > > > >>>>>>>>>>> so
> > >>> >>> > > > > > >>>>>>>>>>>>>> I'm waiting on other approaches.
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> BR,
> > >>> >>> > > > > > >>>>>>>>>>>>>> G
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> On Thu, Mar 13, 2025 at 2:44 AM Shengkai
> > >>> Fang <
> > >>> >>> > > > > > >>>>>>>> fskm...@gmail.com
> > >>> >>> > > > > > >>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>> Looking forward to hearing the good
> news!
> > >>> >>> > > > > > >>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>>>>>>>> Shengkai
> > >>> >>> > > > > > >>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>> Gabor Somogyi <
> gabor.g.somo...@gmail.com
> > >
> > >>> >>> > > > > > >>>> wrote on Wed, Mar 12, 2025
> > >>> >>> > > > > > >>>>>>>>> at 22:24:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> Thanks to you both for the valuable input!
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> Let me take a closer look at the
> > >>> suggestions,
> > >>> >>> > > > > > >> like
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>>> Catalog
> > >>> >>> > > > > > >>>>>>>>>>>>>>> capabilities
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> and possibility of embedding
> > >>> TypeInformation
> > >>> >>> or
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> StateDescriptor metadata directly into
> > >>> the raw
> > >>> >>> > > > > > >>>> state
> > >>> >>> > > > > > >>>>>>>> files...
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> BR,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> G
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> On Wed, Mar 12, 2025 at 8:17 AM
> Shengkai
> > >>> Fang
> > >>> >>> <
> > >>> >>> > > > > > >>>>>>>>>> fskm...@gmail.com
> > >>> >>> > > > > > >>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> Thanks for Zakelly's clarification.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> +1 to delay the discussion about
> this.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> I’d like to share my perspective on
> the
> > >>> State
> > >>> >>> > > > > > >>>>> Catalog
> > >>> >>> > > > > > >>>>>>>>>> proposal.
> > >>> >>> > > > > > >>>>>>>>>>>>> While
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> introducing this capability is
> > >>> beneficial,
> > >>> >>> > > > > > >> there
> > >>> >>> > > > > > >>>> is
> > >>> >>> > > > > > >>>>> a
> > >>> >>> > > > > > >>>>>>>>>> blocker:
> > >>> >>> > > > > > >>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> current
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> StateBackend architecture does not
> > permit
> > >>> >>> > > > > > >>>> operators
> > >>> >>> > > > > > >>>>> to
> > >>> >>> > > > > > >>>>>>>>> encode
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> TypeInformation into the state—it
> only
> > >>> >>> > > > > > >> preserves
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>>>>> Serializer.
> > >>> >>> > > > > > >>>>>>>>>>>>> This
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> limitation creates an asymmetry, as
> > >>> operators
> > >>> >>> > > > > > >>>> alone
> > >>> >>> > > > > > >>>>>>>> retain
> > >>> >>> > > > > > >>>>>>>>>>>>> knowledge
> > >>> >>> > > > > > >>>>>>>>>>>>>> of
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> data structure’s schema.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> To address this, I suggest allowing
> > >>> operators
> > >>> >>> > > > > > >> to
> > >>> >>> > > > > > >>>>> embed
> > >>> >>> > > > > > >>>>>>>>>>>>>> TypeInformation
> > >>> >>> > > > > > >>>>>>>>>>>>>>> or
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> StateDescriptor metadata directly
> into
> > >>> the
> > >>> >>> raw
> > >>> >>> > > > > > >>>> state
> > >>> >>> > > > > > >>>>>>>> files.
> > >>> >>> > > > > > >>>>>>>>>>> Such
> > >>> >>> > > > > > >>>>>>>>>>>> a
> > >>> >>> > > > > > >>>>>>>>>>>>>>> design
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> would enable the Catalog to:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> 1. Parse state files and
> > programmatically
> > >>> >>> > > > > > >> derive
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>> schema
> > >>> >>> > > > > > >>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> structural
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> guarantees for each state.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> 2. Leverage existing Flink Table
> > >>> utilities,
> > >>> >>> > > > > > >> such
> > >>> >>> > > > > > >>>> as
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> LegacyTypeInfoDataTypeConverter (in
> > >>> >>> > > > > > >>>>>>>>>>>>>>> org.apache.flink.table.types.utils),
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> bridge TypeInformation and DataType
> > >>> >>> > > > > > >> conversions.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> If we cannot store the
> TypeInformation
> > >>> or
> > >>> >>> > > > > > >>>>>>>> StateDescriptor
> > >>> >>> > > > > > >>>>>>>>>> into
> > >>> >>> > > > > > >>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>> raw
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> state files, I am +1 for this FLIP to
> > use
> > >>> >>> > > > > > >>>> metadata
> > >>> >>> > > > > > >>>>>>> column
> > >>> >>> > > > > > >>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>> retrieve
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> information.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> Shengkai
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> Zakelly Lan <zakelly....@gmail.com>
> > >>> >>> > > > > > >>>> wrote on Wed, Mar 12, 2025
> > >>> >>> > > > > > >>>>>>>> at 12:43:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Hi Gabor and Shengkai,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Thanks for sharing your thoughts!
> This
> > >>> is a
> > >>> >>> > > > > > >>>> long
> > >>> >>> > > > > > >>>>>>>>> discussion
> > >>> >>> > > > > > >>>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>> sorry
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> for
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> the late reply (I'm busy catching up
> > >>> with
> > >>> >>> > > > > > >>>> release
> > >>> >>> > > > > > >>>>>>> 2.0
> > >>> >>> > > > > > >>>>>>>>> these
> > >>> >>> > > > > > >>>>>>>>>>>>> days).
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Let me first clarify your thoughts
> to
> > >>> ensure
> > >>> >>> > > > > > >> I
> > >>> >>> > > > > > >>>>>>>> understand
> > >>> >>> > > > > > >>>>>>>>>>>>>> correctly.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> IIUC,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> there is no persistent configuration
> > for
> > >>> >>> > > > > > >> state
> > >>> >>> > > > > > >>>> TTL
> > >>> >>> > > > > > >>>>>>> in
> > >>> >>> > > > > > >>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>> checkpoint.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> While
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> you can infer that TTL is enabled by
> > >>> reading
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>>>> serializer,
> > >>> >>> > > > > > >>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> checkpoint
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> itself only stores the last access
> > time
> > >>> for
> > >>> >>> > > > > > >>>> each
> > >>> >>> > > > > > >>>>>>> value.
> > >>> >>> > > > > > >>>>>>>>> So
> > >>> >>> > > > > > >>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>> only
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> thing
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> we can show is the last access time
> > for
> > >>> each
> > >>> >>> > > > > > >>>>> value.
> > >>> >>> > > > > > >>>>>>> But
> > >>> >>> > > > > > >>>>>>>>> it
> > >>> >>> > > > > > >>>>>>>>>> is
> > >>> >>> > > > > > >>>>>>>>>>>> not
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> required
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> for all state backends to store
> this,
> > as
> > >>> >>> they
> > >>> >>> > > > > > >>>> may
> > >>> >>> > > > > > >>>>>>>>> directly
> > >>> >>> > > > > > >>>>>>>>>>>> store
> > >>> >>> > > > > > >>>>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> expired time. This will also
> increase
> > >>> the
> > >>> >>> > > > > > >>>>>>> difficulty of
> > >>> >>> > > > > > >>>>>>>>>>>>>>> implementation
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> &
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> maintenance.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> This once again reiterates the
> > >>> importance of
> > >>> >>> > > > > > >>>>> unified
> > >>> >>> > > > > > >>>>>>>>>> metadata
> > >>> >>> > > > > > >>>>>>>>>>>> for
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> checkpoints. I’m planning on adding
> > >>> this,
> > >>> >>> and
> > >>> >>> > > > > > >>>> we
> > >>> >>> > > > > > >>>>> may
> > >>> >>> > > > > > >>>>>>>>>>>> collaborate
> > >>> >>> > > > > > >>>>>>>>>>>>> on
> > >>> >>> > > > > > >>>>>>>>>>>>>>> it
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> in
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> the future.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata
> Column
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> I'm not in favor of adding a new
> > >>> connector
> > >>> >>> > > > > > >> for
> > >>> >>> > > > > > >>>>>>>> metadata.
> > >>> >>> > > > > > >>>>>>>>>> The
> > >>> >>> > > > > > >>>>>>>>>>>>>> metadata
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> is
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> more like one-time information
> instead
> > >>> of a
> > >>> >>> > > > > > >>>>>>> streaming
> > >>> >>> > > > > > >>>>>>>>> data
> > >>> >>> > > > > > >>>>>>>>>>> that
> > >>> >>> > > > > > >>>>>>>>>>>>>>> changes
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> all
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> the time, so a single connector
> seems
> > >>> to be
> > >>> >>> > > > > > >> an
> > >>> >>> > > > > > >>>>>>>> overkill.
> > >>> >>> > > > > > >>>>>>>>> It
> > >>> >>> > > > > > >>>>>>>>>>> is
> > >>> >>> > > > > > >>>>>>>>>>>>> not
> > >>> >>> > > > > > >>>>>>>>>>>>>>> easy
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> withdraw a connector if we have a
> > better
> > >>> >>> > > > > > >>>> solution
> > >>> >>> > > > > > >>>>> in
> > >>> >>> > > > > > >>>>>>>>>> future.
> > >>> >>> > > > > > >>>>>>>>>>>> I'm
> > >>> >>> > > > > > >>>>>>>>>>>>>> not
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> familiar with current Catalog
> > >>> capabilities,
> > >>> >>> > > > > > >>>> and if
> > >>> >>> > > > > > >>>>>>> it
> > >>> >>> > > > > > >>>>>>>>> could
> > >>> >>> > > > > > >>>>>>>>>>>>> extract
> > >>> >>> > > > > > >>>>>>>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> show some operator-level information
> > >>> from
> > >>> >>> > > > > > >>>>> savepoint,
> > >>> >>> > > > > > >>>>>>>> that
> > >>> >>> > > > > > >>>>>>>>>>> would
> > >>> >>> > > > > > >>>>>>>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> great.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> If the Catalog can't do that, I
> would
> > >>> >>> > > > > > >> consider
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>>> current
> > >>> >>> > > > > > >>>>>>>>>>> FLIP
> > >>> >>> > > > > > >>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>> be a
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> compromise solution.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> And if we have that unified metadata
> > for
> > >>> >>> > > > > > >>>>>>>>>> checkpoint/savepoint
> > >>> >>> > > > > > >>>>>>>>>>>> in
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> future,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> we
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> may directly register savepoint in
> > >>> catalog,
> > >>> >>> > > > > > >> and
> > >>> >>> > > > > > >>>>>>> create
> > >>> >>> > > > > > >>>>>>>> a
> > >>> >>> > > > > > >>>>>>>>>>> source
> > >>> >>> > > > > > >>>>>>>>>>>>>>> without
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> specifying complex columns, as well
> as
> > >>> >>> > > > > > >> describe
> > >>> >>> > > > > > >>>>> the
> > >>> >>> > > > > > >>>>>>>>>> savepoint
> > >>> >>> > > > > > >>>>>>>>>>>>>> catalog
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> get the metadata. That's a good
> > >>> solution in
> > >>> >>> > > > > > >> my
> > >>> >>> > > > > > >>>>> mind.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> Zakelly
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> On Wed, Mar 12, 2025 at 10:35 AM Shengkai Fang <fskm...@gmail.com> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> Hi Gabor,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with
> > >>> >>> > > > > > >>>>>>> `savepoint-metadata`
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> I would argue against introducing a
> > new
> > >>> >>> > > > > > >>>>> connector
> > >>> >>> > > > > > >>>>>>>> type
> > >>> >>> > > > > > >>>>>>>>>>> named
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> savepoint-metadata, as the existing
> > >>> Catalog
> > >>> >>> > > > > > >>>>>>> mechanism
> > >>> >>> > > > > > >>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>>>>> inherently
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> provide the necessary connector
> > factory
> > >>> >>> > > > > > >>>>>>> capabilities.
> > >>> >>> > > > > > >>>>>>>>>> I’ve
> > >>> >>> > > > > > >>>>>>>>>>>>>> detailed
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> this
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> proposal in branch[1]. Please take
> a
> > >>> moment
> > >>> >>> > > > > > >>>> to
> > >>> >>> > > > > > >>>>>>> review
> > >>> >>> > > > > > >>>>>>>>> it.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> If we introduce a connector named
> > >>> >>> > > > > > >>>>>>>> `savepoint-metadata`,
> > >>> >>> > > > > > >>>>>>>>>> it
> > >>> >>> > > > > > >>>>>>>>>>>>> means
> > >>> >>> > > > > > >>>>>>>>>>>>>>> user
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> create a temporary table with
> > connector
> > >>> >>> > > > > > >>>>>>>>>>> `savepoint-metadata`
> > >>> >>> > > > > > >>>>>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> connector needs to check whether
> > table
> > >>> >>> > > > > > >>>> schema is
> > >>> >>> > > > > > >>>>>>> same
> > >>> >>> > > > > > >>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>> schema
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> we
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> proposed in the FLIP. On the other
> > >>> hand,
> > >>> >>> > > > > > >> it's
> > >>> >>> > > > > > >>>>> not
> > >>> >>> > > > > > >>>>>>>> easy
> > >>> >>> > > > > > >>>>>>>>>> work
> > >>> >>> > > > > > >>>>>>>>>>>> for
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> others
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> users a metadata table with same
> > >>> schema.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> [1] https://github.com/apache/flink/compare/master...fsk119:flink:state-metadata?expand=1#diff-712a7bc92fe46c405fb0e61b475bb2a005cb7a72bab7df28bbb92744bcb5f465R63
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> Shengkai
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> On Tue, Mar 11, 2025 at 4:56 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Hi Shengkai,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> From directional perspective I
> agree
> > >>> your
> > >>> >>> > > > > > >>>> idea
> > >>> >>> > > > > > >>>>>>> how
> > >>> >>> > > > > > >>>>>>>> it
> > >>> >>> > > > > > >>>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> implemented.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Previously I've mentioned that TTL
> > >>> >>> > > > > > >>>> information
> > >>> >>> > > > > > >>>>>>> is
> > >>> >>> > > > > > >>>>>>>> not
> > >>> >>> > > > > > >>>>>>>>>>>> exposed
> > >>> >>> > > > > > >>>>>>>>>>>>>> on
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> processor API (which the SQL state
> > >>> >>> > > > > > >>>> connector
> > >>> >>> > > > > > >>>>>>> uses
> > >>> >>> > > > > > >>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>> read
> > >>> >>> > > > > > >>>>>>>>>>>>> data)
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> and unless somebody shows me the
> > >>> opposite
> > >>> >>> > > > > > >>>> this
> > >>> >>> > > > > > >>>>>>> FLIP
> > >>> >>> > > > > > >>>>>>>> is
> > >>> >>> > > > > > >>>>>>>>>> not
> > >>> >>> > > > > > >>>>>>>>>>>>> going
> > >>> >>> > > > > > >>>>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> address
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> this to avoid feature creep. Our
> > users
> > >>> >>> > > > > > >> are
> > >>> >>> > > > > > >>>>> also
> > >>> >>> > > > > > >>>>>>>>>>> interested
> > >>> >>> > > > > > >>>>>>>>>>>> in
> > >>> >>> > > > > > >>>>>>>>>>>>>> TTL
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> so
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> sooner or later we're going to
> > expose
> > >>> it,
> > >>> >>> > > > > > >>>> this
> > >>> >>> > > > > > >>>>>>> is
> > >>> >>> > > > > > >>>>>>>>>> matter
> > >>> >>> > > > > > >>>>>>>>>>> of
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> scheduling.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with
> > >>> >>> > > > > > >>>>>>>> `savepoint-metadata`
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Not sure I understand your point
> at
> > >>> all
> > >>> >>> > > > > > >>>>> related
> > >>> >>> > > > > > >>>>>>>>>>>> StateCatalog.
> > >>> >>> > > > > > >>>>>>>>>>>>>>> First
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> of
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> all
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I can't agree more that
> StateCatalog
> > >>> is
> > >>> >>> > > > > > >>>> needed
> > >>> >>> > > > > > >>>>>>> and
> > >>> >>> > > > > > >>>>>>>>> is a
> > >>> >>> > > > > > >>>>>>>>>>>>> planned
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> building
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> block in an upcoming
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> FLIP but not sure how can it help
> > >>> now? No
> > >>> >>> > > > > > >>>>> matter
> > >>> >>> > > > > > >>>>>>>>> what,
> > >>> >>> > > > > > >>>>>>>>>>> your
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> knowledge
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> is
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> essential when we add
> StateCatalog.
> > >>> Let
> > >>> >>> > > > > > >> me
> > >>> >>> > > > > > >>>>>>> expose
> > >>> >>> > > > > > >>>>>>>> my
> > >>> >>> > > > > > >>>>>>>>>>>>>>> understanding
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> in
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> this
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> area:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * First we need create table
> > >>> statements
> > >>> >>> > > > > > >> to
> > >>> >>> > > > > > >>>>>>> access
> > >>> >>> > > > > > >>>>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>> data
> > >>> >>> > > > > > >>>>>>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> metadata
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * When we have that then we can
> add
> > >>> >>> > > > > > >>>>> StateCatalog
> > >>> >>> > > > > > >>>>>>>>> which
> > >>> >>> > > > > > >>>>>>>>>>>> could
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> potentially
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> ease the life of users by for ex.
> > >>> giving
> > >>> >>> > > > > > >>>>>>>>> off-the-shelf
> > >>> >>> > > > > > >>>>>>>>>>>> tables
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> without
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> sweating with create table
> > statements
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> User expectations:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See state data (this is
> fulfilled
> > >>> with
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>> existing
> > >>> >>> > > > > > >>>>>>>>>>>>>> connector)
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See metadata about state data
> like
> > >>> TTL
> > >>> >>> > > > > > >>>> (this
> > >>> >>> > > > > > >>>>>>> can
> > >>> >>> > > > > > >>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>> added
> > >>> >>> > > > > > >>>>>>>>>>>>> as
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> metadata
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> column as you suggested since it
> > >>> belongs
> > >>> >>> > > > > > >> to
> > >>> >>> > > > > > >>>>> the
> > >>> >>> > > > > > >>>>>>>> data)
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> * See metadata about operators
> (this
> > >>> can
> > >>> >>> > > > > > >> be
> > >>> >>> > > > > > >>>>>>> added
> > >>> >>> > > > > > >>>>>>>>> from
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> savepoint-metadata)
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> Important to highlight that state
> > data
> > >>> >>> > > > > > >>>> table
> > >>> >>> > > > > > >>>>>>> format
> > >>> >>> > > > > > >>>>>>>>>>> differs
> > >>> >>> > > > > > >>>>>>>>>>>>>> from
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> metadata table format. Namely one
> > >>> table
> > >>> >>> > > > > > >> has
> > >>> >>> > > > > > >>>>> rows
> > >>> >>> > > > > > >>>>>>>> for
> > >>> >>> > > > > > >>>>>>>>>>> state
> > >>> >>> > > > > > >>>>>>>>>>>>>> values
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> another has rows for operators,
> > right?
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I think that's the reason why
> you've
> > >>> >>> > > > > > >>>>> pointed
> > >>> >>> > > > > > >>>>>>> out
> > >>> >>> > > > > > >>>>>>>>>> that
> > >>> >>> > > > > > >>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> suggested
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> metadata columns are somewhat
> > clunky.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> As a conclusion I agree to add
> > >>> >>> > > > > > >>>>> ${state-name}_ttl
> > >>> >>> > > > > > >>>>>>>>>> metadata
> > >>> >>> > > > > > >>>>>>>>>>>>>> column
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> later
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> on
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> since it belongs to the state
> value
> > >>> and
> > >>> >>> > > > > > >>>>> adding a
> > >>> >>> > > > > > >>>>>>>> new
> > >>> >>> > > > > > >>>>>>>>>>> table
> > >>> >>> > > > > > >>>>>>>>>>>>> type
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> (like
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> you
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> suggested similar to PG [1])
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> for metadata. Please see how Spark
> > >>> does
> > >>> >>> > > > > > >>>> that
> > >>> >>> > > > > > >>>>> too
> > >>> >>> > > > > > >>>>>>>> [2].
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> If you have better approach then
> > >>> please
> > >>> >>> > > > > > >>>>>>> elaborate
> > >>> >>> > > > > > >>>>>>>>> with
> > >>> >>> > > > > > >>>>>>>>>>> more
> > >>> >>> > > > > > >>>>>>>>>>>>>>> details
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> help me to understand your point.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in
> TB
> > >>> >>> > > > > > >>>>> savepoints
> > >>> >>> > > > > > >>>>>>>> that
> > >>> >>> > > > > > >>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>> number
> > >>> >>> > > > > > >>>>>>>>>>>>>>> of
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> keys
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> be extremely huge but not the per
> > key
> > >>> >>> > > > > > >>>> state
> > >>> >>> > > > > > >>>>>>>> itself.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> But again, this is a good feature
> > >>> as-is
> > >>> >>> > > > > > >>>> and
> > >>> >>> > > > > > >>>>>>> can
> > >>> >>> > > > > > >>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>> handled
> > >>> >>> > > > > > >>>>>>>>>>>>>> in a
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> separate
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> jira.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> I've just created https://issues.apache.org/jira/browse/FLINK-37456.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> [2] https://www.databricks.com/blog/announcing-state-reader-api-new-statestore-data-source
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> BR,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> G
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> On Tue, Mar 11, 2025 at 3:55 AM Shengkai Fang <fskm...@gmail.com> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Hi, Gabor. Thanks for your
> > response.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Thank you for addressing the
> > >>> >>> > > > > > >> limitations
> > >>> >>> > > > > > >>>>> here.
> > >>> >>> > > > > > >>>>>>>>>>> However, I
> > >>> >>> > > > > > >>>>>>>>>>>>>>> believe
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> it
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> would
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> be beneficial to further clarify
> > the
> > >>> >>> > > > > > >> API
> > >>> >>> > > > > > >>>> in
> > >>> >>> > > > > > >>>>>>> this
> > >>> >>> > > > > > >>>>>>>>> FLIP
> > >>> >>> > > > > > >>>>>>>>>>>>>> regarding
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> how
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> users
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> can specify the TTL column.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> One potential approach that comes
> > to
> > >>> >>> > > > > > >>>> mind is
> > >>> >>> > > > > > >>>>>>>> using
> > >>> >>> > > > > > >>>>>>>>> a
> > >>> >>> > > > > > >>>>>>>>>>>>>>> standardized
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> naming
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> convention such as
> > ${state-name}_ttl
> > >>> >>> > > > > > >> for
> > >>> >>> > > > > > >>>> the
> > >>> >>> > > > > > >>>>>>>>> metadata
> > >>> >>> > > > > > >>>>>>>>>>>>> column
> > >>> >>> > > > > > >>>>>>>>>>>>>>> that
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> defines
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> the TTL value. In terms of
> > >>> >>> > > > > > >>>> implementation,
> > >>> >>> > > > > > >>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> listReadableMetadata
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> function could:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. Read the table’s columns and
> > >>> >>> > > > > > >>>>> configuration,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Extract all defined state
> names,
> > >>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 3. Return a structured list of
> > >>> metadata
> > >>> >>> > > > > > >>>>>>> entries
> > >>> >>> > > > > > >>>>>>>>>>> formatted
> > >>> >>> > > > > > >>>>>>>>>>>>> as
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> ${state-name}_ttl.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> WDYT?
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with
> > >>> >>> > > > > > >>>>>>>>> `savepoint-metadata`
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Introducing a new connector type
> at
> > >>> >>> > > > > > >> this
> > >>> >>> > > > > > >>>>> stage
> > >>> >>> > > > > > >>>>>>>> may
> > >>> >>> > > > > > >>>>>>>>>>>>>>> unnecessarily
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> complicate
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> the system. Given that every
> table
> > >>> >>> > > > > > >>>> already
> > >>> >>> > > > > > >>>>>>>> belongs
> > >>> >>> > > > > > >>>>>>>>>> to a
> > >>> >>> > > > > > >>>>>>>>>>>>>>> Catalog,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> which
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> is
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> designed to provide a Factory for
> > >>> >>> > > > > > >>>> building
> > >>> >>> > > > > > >>>>>>> source
> > >>> >>> > > > > > >>>>>>>>> or
> > >>> >>> > > > > > >>>>>>>>>>> sink
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> connectors, I
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> propose integrating a dedicated
> > >>> >>> > > > > > >>>> StateCatalog
> > >>> >>> > > > > > >>>>>>>>> instead.
> > >>> >>> > > > > > >>>>>>>>>>>> This
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> approach
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> would
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> allow us to:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 1. Leverage the Catalog’s
> existing
> > >>> >>> > > > > > >>>>>>> capabilities
> > >>> >>> > > > > > >>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>> manage
> > >>> >>> > > > > > >>>>>>>>>>>>> TTL
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> metadata
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> (e.g., state names and TTL logic)
> > >>> >>> > > > > > >> without
> > >>> >>> > > > > > >>>>>>>>> duplicating
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> functionality.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> 2. Provide a unified interface
> for
> > >>> >>> > > > > > >>>> connector
> > >>> >>> > > > > > >>>>>>>>>>>> instantiation
> > >>> >>> > > > > > >>>>>>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> metadata
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> handling through the Catalog’s
> > >>> Factory
> > >>> >>> > > > > > >>>>>>> pattern.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Would this design decision better
> > >>> align
> > >>> >>> > > > > > >>>> with
> > >>> >>> > > > > > >>>>>>> our
> > >>> >>> > > > > > >>>>>>>>>>>>>> architecture’s
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> extensibility and reduce
> > redundancy?
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in
> TB
> > >>> >>> > > > > > >>>>>>> savepoints
> > >>> >>> > > > > > >>>>>>>>> that
> > >>> >>> > > > > > >>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>> number
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> of
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> keys
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> be extremely huge but not the
> per
> > >>> key
> > >>> >>> > > > > > >>>>> state
> > >>> >>> > > > > > >>>>>>>>> itself.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> But again, this is a good
> feature
> > >>> >>> > > > > > >> as-is
> > >>> >>> > > > > > >>>>> and
> > >>> >>> > > > > > >>>>>>> can
> > >>> >>> > > > > > >>>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>>> handled
> > >>> >>> > > > > > >>>>>>>>>>>>>>> in a
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> separate
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> jira.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> +1 for a separate jira.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> Shengkai
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> On Mon, Mar 10, 2025 at 7:05 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Hi Shengkai,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Please see my comments inline.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> BR,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> G
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 3, 2025 at 7:07 AM Shengkai Fang <fskm...@gmail.com> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Hi, Gabor. Thanks for the
> > >>> >>> > > > > > >> FLIP.
> > >>> >>> > > > > > >>>> I
> > >>> >>> > > > > > >>>>>>> have
> > >>> >>> > > > > > >>>>>>>>> some
> > >>> >>> > > > > > >>>>>>>>>>>>>> questions
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> about
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> FLIP:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> How can users retrieve the
> state
> > >>> >>> > > > > > >> TTL
> > >>> >>> > > > > > >>>>>>>>>> (Time-to-Live)
> > >>> >>> > > > > > >>>>>>>>>>>> for
> > >>> >>> > > > > > >>>>>>>>>>>>>>> each
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> value
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> column?
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> From my understanding of the
> > >>> >>> > > > > > >> current
> > >>> >>> > > > > > >>>>>>> design,
> > >>> >>> > > > > > >>>>>>>> it
> > >>> >>> > > > > > >>>>>>>>>>> seems
> > >>> >>> > > > > > >>>>>>>>>>>>>> that
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> this
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> functionality is not supported.
> > >>> >>> > > > > > >> Could
> > >>> >>> > > > > > >>>>> you
> > >>> >>> > > > > > >>>>>>>>> clarify
> > >>> >>> > > > > > >>>>>>>>>>> if
> > >>> >>> > > > > > >>>>>>>>>>>>>> there
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> are
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> plans
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> address this limitation?
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Since the state processor API is
> > not
> > >>> >>> > > > > > >>>> yet
> > >>> >>> > > > > > >>>>>>>> exposing
> > >>> >>> > > > > > >>>>>>>>>>> this
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> information
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> this
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> would require several steps.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> First, the state processor API
> > >>> >>> > > > > > >> support
> > >>> >>> > > > > > >>>>>>> needs to
> > >>> >>> > > > > > >>>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>> added
> > >>> >>> > > > > > >>>>>>>>>>>>>>> which
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> then
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> exposed on the SQL API.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This is definitely a future
> > >>> >>> > > > > > >> improvement
> > >>> >>> > > > > > >>>>>>> which
> > >>> >>> > > > > > >>>>>>>> is
> > >>> >>> > > > > > >>>>>>>>>>> useful
> > >>> >>> > > > > > >>>>>>>>>>>>> and
> > >>> >>> > > > > > >>>>>>>>>>>>>>> can
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> handled
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> in a separate jira.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata
> > >>> >>> > > > > > >> Column
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> The metadata information
> > described
> > >>> >>> > > > > > >> in
> > >>> >>> > > > > > >>>>> the
> > >>> >>> > > > > > >>>>>>>> FLIP
> > >>> >>> > > > > > >>>>>>>>>>>> appears
> > >>> >>> > > > > > >>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>> be
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> intended
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> to
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> describe the state files stored
> > at
> > >>> >>> > > > > > >> a
> > >>> >>> > > > > > >>>>>>> specific
> > >>> >>> > > > > > >>>>>>>>>>>> location.
> > >>> >>> > > > > > >>>>>>>>>>>>>> To
> > >>> >>> > > > > > >>>>>>>>>>>>>>>> me,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> this
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> concept
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> aligns more closely with system
> > >>> >>> > > > > > >>>> tables
> > >>> >>> > > > > > >>>>>>> like
> > >>> >>> > > > > > >>>>>>>>>>> pg_tables
> > >>> >>> > > > > > >>>>>>>>>>>>> in
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> PostgreSQL
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> [1]
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> or
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> the INFORMATION_SCHEMA in MySQL
> > >>> >>> > > > > > >> [2].
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Adding a new connector with `savepoint-metadata` is a possibility
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> where we can create such functionality.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> I'm not against that, I just want to have a common agreement that
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> we would like to move in that direction.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> (As a side note, not just PG but Spark also has a similar approach
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> and I basically like the idea.)
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> If we went in that direction, savepoint metadata could be exposed
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> so that one row represents an operator with its values,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> something like this:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ┌─────────────────────────┬─────────────────────────────┬────────────────────────────────┬───────────┬──────────────┬──────────────────┬───────────────────────────┬──────────────────────┐
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │operatorName             │operatorUid                  │operatorHash                    │parallelism│maxParallelism│subtaskStatesCount│coordinatorStateSizeInBytes│totalStatesSizeInBytes│
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │Source: datagen-source   │datagen-source-uid           │47aee94394d6ea26e2d544bef0a37bb5│2          │128           │2                 │16                         │546                   │
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │long-udf-with-master-hook│long-udf-with-master-hook-uid│6ed3f40bff3c8dfcdfcb95128a1018f1│2          │128           │2                 │0                          │0                     │
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │value-process            │value-process-uid            │ca4f5fe9a637b656f09ea78b3e7a15b9│2          │128           │2                 │0                          │40726                 │
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> └─────────────────────────┴─────────────────────────────┴────────────────────────────────┴───────────┴──────────────┴──────────────────┴───────────────────────────┴──────────────────────┘
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This table can then be joined with the tables created by the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> already existing `savepoint` connector, based on the UID hash
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> (which is unique and always exists).
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This would mean that the already existing table would need only a
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> single metadata column, which is the UID hash.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> WDYT?
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> @zakelly, plz share your thoughts too.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> If we opt to use metadata columns, every record in the table
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> would end up having identical values for these columns (please
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> correct me if I’m mistaken). On the other hand, the state
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> connector requires users to specify an operator UID or operator
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> UID hash, after which it outputs user-defined values in its
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> records. This approach feels somewhat redundant to me.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> If we added a new `savepoint-metadata` connector then this can be
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> addressed.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> On the other hand, UID and UID hash have an either-or
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> relationship from a config perspective, so when a user provides
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> the UID, he/she may be interested in the hash for further
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> calculations (all of the Flink internals depend on the hash).
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Printing out the human-readable UID is an explicit requirement
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> from the user side because hashes are not human readable.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> 3. Handling LIST and MAP States in the State Connector
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> I have concerns about how the current design handles LIST and
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> MAP states. Specifically, the state connector uses Flink SQL’s
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> MAP and ARRAY types, which implies that it attempts to load
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> entire MAP or LIST states into memory.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> However, in many real-world scenarios, these states can grow
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> very large. Typically, the state API addresses this by
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> providing an iterator to traverse elements within the state
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> incrementally. I’m unsure whether I’ve missed something in
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> FLIP-496 or FLIP-512, but it seems that the current design
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> might struggle with scalability in such cases.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> You see it right: the current implementation keeps the state for
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> a single key in memory.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Back then we considered this potential issue and concluded that
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> handling it is not necessarily needed for the initial version
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> and can be done as a later improvement.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> Up until now we've seen, even in TB-sized savepoints, that the
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> number of keys can be extremely large but not the per-key state
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> itself.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> But again, this is a good feature as-is and can be handled in a
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> separate jira.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Shengkai
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> [2] https://dev.mysql.com/doc/refman/8.4/en/information-schema-tables-table.html
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>> Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on Mon, Mar 3, 2025 at 02:00:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> Hi Zakelly,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> In order to shoot for simplicity `METADATA VIRTUAL` as key words
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> for definition is the target.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> When it's not super complex the latter can be added too.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> BR,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> G
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Mar 2, 2025 at 3:37 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Hi Gabor,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> +1 for this.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Will the metadata column use `METADATA VIRTUAL` as key words for
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> definition, or `METADATA FROM xxx VIRTUAL` for renaming, just
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> like the Kafka table?
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> Zakelly
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Mar 1, 2025 at 1:31 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Hi All,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> I'd like to start a discussion of FLIP-512: Add meta information
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> to SQL state connector [1].
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Feel free to add your thoughts to make this feature better.
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-512%3A+Add+meta+information+to+SQL+state+connector
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> BR,
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>> G
> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>>>>>>
