Re: [DISCUSS] FLIP-512: Add meta information to SQL state connector

Shengkai Fang Wed, 26 Mar 2025 19:16:08 -0700

Many thanks for your reminder, Leonard. Here's the link I mentioned[1].

Best,
Shengkai


[1] https://github.com/apache/flink/pull/26358

Leonard Xu <[email protected]> wrote on Thu, Mar 27, 2025 at 10:05:

> Your link is broken, Shengkai
>
> Best,
> Leonard
>
> > On Mar 27, 2025, at 10:01, Shengkai Fang <[email protected]> wrote:
> >
> > Hi, All.
> >
> > I wrote a simple demo to illustrate my idea. Hope this helps.
> >
> > Best,
> > Shengkai
> >
> >
> > https://github.com/apache/flink/compare/master...fsk119:flink:example?expand=1
> >
> > Gabor Somogyi <[email protected]> wrote on Wed, Mar 26, 2025 at 15:54:
> >
> >>> I'm fine with a separate SQL connector for metadata, so maybe we could
> >>> update the FLIP about our discussion?
> >>
> >> Sorry, I forgot this part. Yeah, no matter what we choose, I'm going to
> >> update the FLIP.
> >>
> >> G
> >>
> >>
> >> On Wed, Mar 26, 2025 at 8:51 AM Gabor Somogyi <[email protected]> wrote:
> >>
> >>> Hi All,
> >>>
> >>> I also lack knowledge of PTFs, so I've read just the motivation
> >>> part:
> >>>
> >>> "The SQL 2016 standard introduced a way of defining custom SQL operators
> >>> defined by ISO/IEC 19075-7:2021 (Part 7: Polymorphic table functions).
> >>> ~200 pages define how this new kind of function can consume and produce
> >>> tables with various execution properties.
> >>> Unfortunately, this part of the standard is not publicly available."
> >>>
> >>> Of course we can take a look at some examples, but do we really want to
> >>> expose state data with this construct, which is described in ~200 pages
> >>> and is part of a standard that is not publicly available? 🙂
> >>> I mean the dataset is a couple of rows, and the use case is joining it
> >>> with another table, such as state data.
> >>> If somebody can name the advantages I would buy that, but from my limited
> >>> understanding this would be overkill here.
> >>>
> >>> BR,
> >>> G
> >>>
> >>>
> >>> On Wed, Mar 26, 2025 at 8:28 AM Gyula Fóra <[email protected]> wrote:
> >>>
> >>>> Hi Zakelly, Shengkai!
> >>>>
> >>>> I don't know too much about PTFs, so it would be interesting to see how
> >>>> the usage would look in practice.
> >>>>
> >>>> Do you have a mockup/example in mind of how the PTF would look, for
> >>>> example when we want to:
> >>>> - Simply display/aggregate what's in the metadata
> >>>> - Join keyed state with some metadata columns
> >>>>
> >>>> Thanks
> >>>> Gyula
> >>>>
> >>>> On Wed, Mar 26, 2025 at 7:33 AM Zakelly Lan <[email protected]>
> >>>> wrote:
> >>>>
> >>>>> Hi everyone,
> >>>>>
> >>>>> I'm fine with a separate SQL connector for metadata, so maybe we could
> >>>>> update the FLIP to reflect our discussion? Also, Shengkai provided a PTF
> >>>>> implementation; does that also meet the requirement?
> >>>>>
> >>>>>
> >>>>> Best,
> >>>>> Zakelly
> >>>>>
> >>>>> On Thu, Mar 20, 2025 at 4:47 PM Gabor Somogyi <[email protected]> wrote:
> >>>>>
> >>>>>> Hi All,
> >>>>>>
> >>>>>> @Zakelly: Gyula correctly summarised what I meant, so please treat the
> >>>>>> content as mine.
> >>>>>> In addition, I'm not against adding a CLI at all. I'm just stating that
> >>>>>> in some cases like this, users would like to have a self-service
> >>>>>> solution where they can provide SQL statements which can trigger alerts
> >>>>>> automatically.
> >>>>>>
> >>>>>> My personal opinion is that a CLI would be beneficial in several cases.
> >>>>>> A good example is when users want to restart a job from specific Kafka
> >>>>>> offsets which are persisted in a savepoint. In such a scenario users are
> >>>>>> more than happy with a CLI, since they expect manual intervention with
> >>>>>> full control. So all in all, one can count on my +1 when a CLI FLIP
> >>>>>> comes up...
> >>>>>>
> >>>>>> BR,
> >>>>>> G
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Mar 20, 2025 at 8:20 AM Gyula Fóra <[email protected]> wrote:
> >>>>>>
> >>>>>>> Hi!
> >>>>>>>
> >>>>>>> @Zakelly Lan <[email protected]>
> >>>>>>> I think what Gabor means is that users want to have predefined SQL
> >>>>>>> scripts to perform state analysis tasks to debug/identify problems,
> >>>>>>> such as a SQL script that joins the metadata table with the state and
> >>>>>>> does some analytics on it.
> >>>>>>>
> >>>>>>> If we have a meta table, then the SQL script that does this is fixed,
> >>>>>>> and users can trigger it on demand by simply providing a new savepoint
> >>>>>>> path.
> >>>>>>>
> >>>>>>> If we have a different mechanism to extract metadata that is not SQL
> >>>>>>> native, then manual steps need to be executed, and a custom SQL script
> >>>>>>> would need to be written that adds the manually extracted metadata
> >>>>>>> into the script.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Gyula
> >>>>>>>
> >>>>>>> On Thu, Mar 20, 2025 at 4:32 AM Zakelly Lan <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> Thanks for your answers! Getting everyone aligned on this topic is
> >>>>>>>> challenging, but it’s definitely worth the effort since it will help
> >>>>>>>> streamline things moving forward.
> >>>>>>>>
> >>>>>>>> @Gabor, are you saying that users are using some scripts to define the
> >>>>>>>> SQL metadata connector and get the information? If so, would a CLI
> >>>>>>>> tool be more convenient? It's easy to invoke and can get the result
> >>>>>>>> swiftly. And there should be some other systems to track the
> >>>>>>>> checkpoint lineage and analyze whether there are outliers in the
> >>>>>>>> metadata (e.g. the state size of one operator), right? Well, maybe I
> >>>>>>>> missed something, so please correct me if I'm wrong.
> >>>>>>>>
> >>>>>>>>> I think the overall vision in Flink SQL is to provide a SQL native
> >>>>>>>>> environment where we can serve complex use-cases like you would
> >>>>>>>>> expect in a regular database.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> @Gyula Well, this is a good point. From the perspective of a
> >>>>>>>> comprehensive SQL experience, I'd +1 treating metadata as data.
> >>>>>>>> Although I doubt there is a need for processing metadata, I won't be
> >>>>>>>> against a separate connector.
> >>>>>>>>
> >>>>>>>> Regarding the CLI tool, I still think it’s worth implementing. Such a
> >>>>>>>> tool could provide savepoint information before resuming from a
> >>>>>>>> savepoint, which would enhance the user experience in CLI-based
> >>>>>>>> workflows. It would be good if someone could implement this feature.
> >>>>>>>> We shouldn’t worry about whether this tool might be retired in the
> >>>>>>>> future. Regardless of the SQL-based solution we eventually adopt, this
> >>>>>>>> capability will remain essential for CLI users. But this is another
> >>>>>>>> topic.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Zakelly
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Mar 20, 2025 at 10:37 AM Shengkai Fang <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> Hi.
> >>>>>>>>>
> >>>>>>>>> After reading the doc[1], I think Spark provides a function for users
> >>>>>>>>> to consume the metadata from the savepoint. In Flink SQL, similar
> >>>>>>>>> functionality can be implemented through Polymorphic Table Functions
> >>>>>>>>> (PTF) as proposed in FLIP-440[2]. Below is a code example[3]
> >>>>>>>>> illustrating this concept:
> >>>>>>>>>
> >>>>>>>>> ```
> >>>>>>>>>    public static class ScalarArgsFunction extends TestProcessTableFunctionBase {
> >>>>>>>>>        public void eval(Integer i, Boolean b) {
> >>>>>>>>>            collectObjects(i, b);
> >>>>>>>>>        }
> >>>>>>>>>    }
> >>>>>>>>> ```
> >>>>>>>>>
> >>>>>>>>> ```
> >>>>>>>>> INSERT INTO sink SELECT * FROM f(i => 42, b => CAST('TRUE' AS BOOLEAN))
> >>>>>>>>> ```
> >>>>>>>>>
> >>>>>>>>> So we can add a built-in function named `read_state_metadata` to read
> >>>>>>>>> savepoint data.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Shengkai
> >>>>>>>>>
> >>>>>>>>> [1] https://docs.databricks.com/aws/en/structured-streaming/read-state?language=SQL
> >>>>>>>>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093
> >>>>>>>>> [3] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/stream/ProcessTableFunctionTestPrograms.java#L140
> >>>>>>>>>
> >>>>>>>>> Gyula Fóra <[email protected]> wrote on Wed, Mar 19, 2025 at 18:37:
> >>>>>>>>>
> >>>>>>>>>> Hi All!
> >>>>>>>>>>
> >>>>>>>>>> Thank you for the answers and concerns from everyone.
> >>>>>>>>>>
> >>>>>>>>>> On the CLI vs State Metadata Connector/Table question, I would also
> >>>>>>>>>> like to step back a little and look at the bigger picture.
> >>>>>>>>>>
> >>>>>>>>>> I think the overall vision in Flink SQL is to provide a SQL native
> >>>>>>>>>> environment where we can serve complex use-cases like you would
> >>>>>>>>>> expect in a regular database.
> >>>>>>>>>> Most features and developments in recent years have gone this way.
> >>>>>>>>>>
> >>>>>>>>>> The State Metadata Table would be a natural and straightforward fit
> >>>>>>>>>> here. So from my side, +1 for that.
> >>>>>>>>>>
> >>>>>>>>>> However, I could understand if we are not ready to add a new
> >>>>>>>>>> connector/format due to maintenance concerns (and general concerns
> >>>>>>>>>> about the design).
> >>>>>>>>>> If that's the issue, then we should spend more time on the design to
> >>>>>>>>>> get comfortable with the approach and seek feedback from the wider
> >>>>>>>>>> community.
> >>>>>>>>>>
> >>>>>>>>>> I am -1 for the CLI/tooling approach, as it will not provide the
> >>>>>>>>>> feature set we are looking for beyond what is already covered by
> >>>>>>>>>> the Java connector. And that approach would come with the same
> >>>>>>>>>> maintenance implications.
> >>>>>>>>>>
> >>>>>>>>>> Cheers
> >>>>>>>>>> Gyula
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Mar 19, 2025 at 11:24 AM Gabor Somogyi <[email protected]> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Zakelly, Shengkai
> >>>>>>>>>>>
> >>>>>>>>>>> Several topics are in flight, so I'm adding brief answers to
> >>>>>>>>>>> each. If I've missed a topic, please point it out.
> >>>>>>>>>>>
> >>>>>>>>>>> @Shengkai: I've read through all the previous FLIPs related to
> >>>>>>>>>>> catalogs, and if we would like to keep the concepts there, then
> >>>>>>>>>>> a one-to-one mapping between savepoint and catalog is a
> >>>>>>>>>>> reasonable direction. In short, I'm happy that you've
> >>>>>>>>>>> highlighted this, and I agree on the whole. I've written it
> >>>>>>>>>>> down previously; I just want to confirm again that a state
> >>>>>>>>>>> catalog is essential and planned. When we reach that point,
> >>>>>>>>>>> your input will be more than welcome.
> >>>>>>>>>>>
> >>>>>>>>>>> @Zakelly: We've already tried the CLI and separate library
> >>>>>>>>>>> approaches with users, and they were not welcomed, for the
> >>>>>>>>>>> following reasons:
> >>>>>>>>>>> * Users want automated tasks, not manual parsing of CLI/library
> >>>>>>>>>>> output. This can be hacked around, but our experience with that
> >>>>>>>>>>> is negative because it's just brittle.
> >>>>>>>>>>> * From a development perspective it's a much bigger effort than
> >>>>>>>>>>> a connector (hard to test; packaging/version handling is an
> >>>>>>>>>>> extra layer of complexity; external FS authentication is a pain
> >>>>>>>>>>> for users, as is expecting them to download savepoints).
> >>>>>>>>>>> * Purely personal opinion, but if we find better ways later,
> >>>>>>>>>>> retiring a CLI is no more lightweight than retiring a connector.
> >>>>>>>>>>>
> >>>>>>>>>>>> It would be great if you give some examples on how user could
> >>>>>>>>>>>> leverage the separate connector to process the metadata.
> >>>>>>>>>>>
> >>>>>>>>>>> The simplest cases:
> >>>>>>>>>>> * give me the overgrowing state uids
> >>>>>>>>>>> * give me the unknown (new or renamed) state uids
> >>>>>>>>>>> * give me the state uids where state size dropped drastically
> >>>>>>>>>>> compared to a previous savepoint (accidental state loss)
> >>>>>>>>>>>
> >>>>>>>>>>> Since it was mentioned: as a general off-topic teaser, yes, it
> >>>>>>>>>>> would be good to have some sort of checkpoint/savepoint lineage,
> >>>>>>>>>>> or whatever we call it. Since we've not yet reached this point,
> >>>>>>>>>>> there are no technical details; it's more like a vision. It's a
> >>>>>>>>>>> common pattern that jobs are physically running but the state
> >>>>>>>>>>> processing is somehow stuck, and it would be good to add some
> >>>>>>>>>>> way to detect that automatically.
> >>>>>>>>>>> The key word here is automation rather than manual evaluation,
> >>>>>>>>>>> since handling 10k+ jobs just doesn't allow for the latter.
> >>>>>>>>>>>
> >>>>>>>>>>> BR,
> >>>>>>>>>>> G
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Wed, Mar 19, 2025 at 6:46 AM Shengkai Fang <[email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi, All.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I want to share more thoughts about the State Catalog.
> >>>>>>>>>>>>
> >>>>>>>>>>>> In the initial design concept, I understood that a savepoint
> >>>>>>>>>>>> and a state catalog have a one-to-one mapping relationship.
> >>>>>>>>>>>> Each operator corresponds to a database, and the state of
> >>>>>>>>>>>> each operator is represented as individual tables. The
> >>>>>>>>>>>> rationale behind this design is:
> >>>>>>>>>>>>
> >>>>>>>>>>>> *State Diversity*: An operator may involve multiple types of
> >>>>>>>>>>>> states. For example, in our VVR design, a "multi-join"
> >>>>>>>>>>>> operator uses keyed states for two input streams and a
> >>>>>>>>>>>> broadcast state for the third stream. This makes it
> >>>>>>>>>>>> challenging to represent all states of an operator within a
> >>>>>>>>>>>> single table.
> >>>>>>>>>>>> *Scalability*: Internally, an operator might have multiple
> >>>>>>>>>>>> keyed states (e.g., value state and list state). However,
> >>>>>>>>>>>> large list states may not fit entirely in memory. To address
> >>>>>>>>>>>> this, we recommend implementing each state as a separate
> >>>>>>>>>>>> table.
> >>>>>>>>>>>>
> >>>>>>>>>>>> To resolve the loosely coupled relationships between operator
> >>>>>>>>>>>> states, we propose embedding predefined views within the
> >>>>>>>>>>>> catalog. These views simplify user understanding of operator
> >>>>>>>>>>>> implementations and provide a more intuitive perspective. For
> >>>>>>>>>>>> instance, a join operator may have multiple state
> >>>>>>>>>>>> implementations (depending on whether the join key includes
> >>>>>>>>>>>> unique attributes), but users primarily care about the data
> >>>>>>>>>>>> associated with a specific join key across input streams.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Returning to the one-to-one mapping between savepoints and
> >>>>>>>>>>>> catalogs, we aim to manage multiple user state catalogs
> >>>>>>>>>>>> through a catalog store. When a user triggers a savepoint for
> >>>>>>>>>>>> a job on the platform:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1. The platform sends a REST request to the JobManager.
> >>>>>>>>>>>> 2. Simultaneously, it registers a new state catalog in the
> >>>>>>>>>>>> catalog store, enabling immediate analysis of state data on
> >>>>>>>>>>>> the platform.
> >>>>>>>>>>>> 3. Deleting a savepoint would also trigger the removal of its
> >>>>>>>>>>>> associated catalog.
> >>>>>>>>>>>>
> >>>>>>>>>>>> This vision assumes that states are self-describing or that a
> >>>>>>>>>>>> state metaservice is introduced to analyze savepoint
> >>>>>>>>>>>> structures.
> >>>>>>>>>>>>
> >>>>>>>>>>>>> How can users create logic to identify differences between
> >>>>>>>>>>>>> multiple savepoints?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Since savepoints and state catalogs are one-to-one mapped,
> >>>>>>>>>>>> users can query metadata via their respective catalogs. For
> >>>>>>>>>>>> example:
> >>>>>>>>>>>>
> >>>>>>>>>>>> 1. `savepoint-${id}`.`system`.`metadata_table`.`<operator-name>`
> >>>>>>>>>>>> provides operator-specific metadata (e.g., state size, type).
> >>>>>>>>>>>> 2. Comparing metadata tables (e.g., schema versions, state
> >>>>>>>>>>>> entry counts) across catalogs reveals structural or
> >>>>>>>>>>>> quantitative differences.
> >>>>>>>>>>>> 3. For deeper analysis, users could write SQL queries to
> >>>>>>>>>>>> compare specific state partitions or leverage the metaservice
> >>>>>>>>>>>> to track state evolution (e.g., added/removed operators,
> >>>>>>>>>>>> modified state configurations).
> >>>>>>>>>>>>
> >>>>>>>>>>>> If we plan to introduce a state catalog in the future, I
> >>>>>>>>>>>> would lean toward using metadata tables. If a utility tool
> >>>>>>>>>>>> can address the challenges we face, could we avoid
> >>>>>>>>>>>> introducing an additional connector?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Shengkai
> >>>>>>>>>>>>
> >>>>>>>>>>>> Gyula Fóra <[email protected]> wrote on Mon, Mar 17, 2025 at 20:25:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi All!
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Without going into too much detail, here are my 2 cents
> >>>>>>>>>>>>> regarding the virtual column / catalog metadata / table
> >>>>>>>>>>>>> (connector) discussion for the State metadata.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> State metadata, such as the types of states, their
> >>>>>>>>>>>>> properties, names, sizes etc., is all valuable information
> >>>>>>>>>>>>> that can be used to enrich the computations we do on state.
> >>>>>>>>>>>>> We can analyze it standalone (for example to discover
> >>>>>>>>>>>>> anomalies in large jobs with many states), across multiple
> >>>>>>>>>>>>> savepoints (to discover how state changed over time), or by
> >>>>>>>>>>>>> joining it with keyed or non-keyed state data to serve more
> >>>>>>>>>>>>> complex queries on the state.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The only solution that seems to serve all these use-cases
> >>>>>>>>>>>>> and requirements in a straightforward and SQL canonical way
> >>>>>>>>>>>>> is to simply expose the state metadata as a separate table.
> >>>>>>>>>>>>> This is a metadata table, but you can also think of it as a
> >>>>>>>>>>>>> data table; it makes no practical difference here.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Once we have a catalog later, the catalog can offer this
> >>>>>>>>>>>>> table out of the box, the same way databases provide
> >>>>>>>>>>>>> metadata tables. For this to work, however, we need another,
> >>>>>>>>>>>>> simpler connector that creates this table.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> +1 for state metadata as a separate connector/table, instead
> >>>>>>>>>>>>> of adding virtual columns and ad hoc catalog metadata that
> >>>>>>>>>>>>> is hard to use in a large number of queries.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>> Gyula
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Mon, Mar 17, 2025 at 12:44 PM Gabor Somogyi <[email protected]> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I’m planning on adding this, and we may collaborate on it
> >>>>>>>>>>>>>>> in the future.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> +1 on this, just ping me.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> After some code digging and a POC, all I can say is that
> >>>>>>>>>>>>>> with heavy effort we can maybe make changes that let us
> >>>>>>>>>>>>>> show the metadata of a savepoint from the catalog.
> >>>>>>>>>>>>>> I'm not against that, but from the user perspective this
> >>>>>>>>>>>>>> has limited value. Let me explain why.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> From a high-level perspective, I see agreement on the
> >>>>>>>>>>>>>> following:
> >>>>>>>>>>>>>> * We should have a catalog which represents the savepoint
> >>>>>>>>>>>>>> data set of one or more jobs (future plan)
> >>>>>>>>>>>>>> * It should be possible to register savepoints in the
> >>>>>>>>>>>>>> catalog, where they then appear as databases (future plan)
> >>>>>>>>>>>>>> * There must be a possibility to create tables from
> >>>>>>>>>>>>>> databases where users can read state data (exists already)
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> In terms of metadata, if I understand correctly, the
> >>>>>>>>>>>>>> suggested approach would be to access it from the catalog
> >>>>>>>>>>>>>> describe command, right? Adding that info when a specific
> >>>>>>>>>>>>>> database describe command is executed could be done.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> The question is, for instance, how can users create logic
> >>>>>>>>>>>>>> that tells them what the difference is between multiple
> >>>>>>>>>>>>>> savepoints?
> >>>>>>>>>>>>>> Just to give some examples:
> >>>>>>>>>>>>>> * per-operator size changes between savepoints
> >>>>>>>>>>>>>> * show values from operator data where state size reaches
> >>>>>>>>>>>>>> a boundary
> >>>>>>>>>>>>>> * in general, "find which checkpoint ruined things" is
> >>>>>>>>>>>>>> quite a common pattern
> >>>>>>>>>>>>>> What I would like to highlight here is that from Flink's
> >>>>>>>>>>>>>> point of view the metadata can be considered static
> >>>>>>>>>>>>>> side-output information, but for users these values are
> >>>>>>>>>>>>>> actual real data around which logic is planned to be
> >>>>>>>>>>>>>> built.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> The metadata is more like one-time information instead of
> >>>>>>>>>>>>>>> a streaming data that changes all the time, so a single
> >>>>>>>>>>>>>>> connector seems to be an overkill.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> State data is also static within a savepoint, and that's
> >>>>>>>>>>>>>> the reason why the State Processor API works in batch
> >>>>>>>>>>>>>> mode.
> >>>>>>>>>>>>>> When we handle multiple checkpoints in a streaming
> >>>>>>>>>>>>>> fashion, this can be viewed from another angle.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> We can come up with a more lightweight solution than a new
> >>>>>>>>>>>>>> connector, but forcing users to parse the catalog describe
> >>>>>>>>>>>>>> command output in order to compare multiple savepoints
> >>>>>>>>>>>>>> doesn't sound like a smooth user experience.
> >>>>>>>>>>>>>> Honestly, I have no other idea how to expose metadata as
> >>>>>>>>>>>>>> real user data, so I'm waiting for other approaches.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> BR,
> >>>>>>>>>>>>>> G
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Thu, Mar 13, 2025 at 2:44 AM Shengkai Fang <[email protected]> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Looking forward to hearing the good news!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Shengkai
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Gabor Somogyi <[email protected]> wrote on Wed, Mar 12, 2025 at 22:24:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks to both of you for the valuable input!
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Let me take a closer look at the suggestions, like the
> >>>>>>>>>>>>>>>> Catalog capabilities and the possibility of embedding
> >>>>>>>>>>>>>>>> TypeInformation or StateDescriptor metadata directly
> >>>>>>>>>>>>>>>> into the raw state files...
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> BR,
> >>>>>>>>>>>>>>>> G
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, Mar 12, 2025 at 8:17 AM Shengkai Fang <[email protected]> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks for Zakelly's clarification.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> +1 to delay the discussion about this.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I’d like to share my perspective on the State Catalog
> >>>>>>>>>>>>>>>>> proposal. While introducing this capability is
> >>>>>>>>>>>>>>>>> beneficial, there is a blocker: the current
> >>>>>>>>>>>>>>>>> StateBackend architecture does not permit operators to
> >>>>>>>>>>>>>>>>> encode TypeInformation into the state; it only
> >>>>>>>>>>>>>>>>> preserves the Serializer. This limitation creates an
> >>>>>>>>>>>>>>>>> asymmetry, as operators alone retain knowledge of the
> >>>>>>>>>>>>>>>>> data structure’s schema.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> To address this, I suggest allowing operators to embed
> >>>>>>>>>>>>>>>>> TypeInformation or StateDescriptor metadata directly
> >>>>>>>>>>>>>>>>> into the raw state files. Such a design would enable
> >>>>>>>>>>>>>>>>> the Catalog to:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 1. Parse state files and programmatically derive the
> >>>>>>>>>>>>>>>>> schema and structural guarantees for each state.
> >>>>>>>>>>>>>>>>> 2. Leverage existing Flink Table utilities, such as
> >>>>>>>>>>>>>>>>> LegacyTypeInfoDataTypeConverter (in
> >>>>>>>>>>>>>>>>> org.apache.flink.table.types.utils), to bridge
> >>>>>>>>>>>>>>>>> TypeInformation and DataType conversions.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> If we cannot store the TypeInformation or
> >>>>>>>>>>>>>>>>> StateDescriptor in the raw state files, I am +1 for
> >>>>>>>>>>>>>>>>> this FLIP to use a metadata column to retrieve the
> >>>>>>>>>>>>>>>>> information.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>> Shengkai
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Zakelly Lan <[email protected]> wrote on Wed, Mar 12, 2025 at 12:43:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi Gabor and Shengkai,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks for sharing your thoughts! This is a long
> >>>>>>>>>>>>>>>>>> discussion, and sorry for the late reply (I've been
> >>>>>>>>>>>>>>>>>> busy catching up with release 2.0 these days).
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Let me first clarify your thoughts to ensure I
> >>>>>>>>>>>>>>>>>> understand correctly. IIUC, there is no persistent
> >>>>>>>>>>>>>>>>>> configuration for state TTL in the checkpoint. While
> >>>>>>>>>>>>>>>>>> you can infer that TTL is enabled by reading the
> >>>>>>>>>>>>>>>>>> serializer, the checkpoint itself only stores the
> >>>>>>>>>>>>>>>>>> last access time for each value. So the only thing we
> >>>>>>>>>>>>>>>>>> can show is the last access time for each value. But
> >>>>>>>>>>>>>>>>>> state backends are not required to store this, as
> >>>>>>>>>>>>>>>>>> they may directly store the expiration time instead.
> >>>>>>>>>>>>>>>>>> Showing it would also increase the difficulty of
> >>>>>>>>>>>>>>>>>> implementation and maintenance.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> This once again underlines the importance of unified
> >>>>>>>>>>>>>>>>>> metadata for checkpoints. I’m planning on adding
> >>>>>>>>>>>>>>>>>> this, and we may collaborate on it in the future.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I'm not in favor of adding a new connector for
> >>>>>>>>>>>>>>>>>> metadata. The metadata is more like one-time
> >>>>>>>>>>>>>>>>>> information than streaming data that changes all the
> >>>>>>>>>>>>>>>>>> time, so a single connector seems to be overkill. It
> >>>>>>>>>>>>>>>>>> is not easy to withdraw a connector if we find a
> >>>>>>>>>>>>>>>>>> better solution in the future. I'm not familiar with
> >>>>>>>>>>>>>>>>>> the current Catalog capabilities, but if the Catalog
> >>>>>>>>>>>>>>>>>> could extract and show some operator-level
> >>>>>>>>>>>>>>>>>> information from a savepoint, that would be great.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> If the Catalog can't do that, I would
> >> consider
> >>>> the
> >>>>>>>>> current
> >>>>>>>>>>> FLIP
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>>> be a
> >>>>>>>>>>>>>>>>>> compromise solution.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> And if we have that unified metadata for
> >>>>>>>>>> checkpoint/savepoint
> >>>>>>>>>>>> in
> >>>>>>>>>>>>>>>> future,
> >>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>> may directly register savepoint in catalog,
> >> and
> >>>>>>> create
> >>>>>>>> a
> >>>>>>>>>>> source
> >>>>>>>>>>>>>>> without
> >>>>>>>>>>>>>>>>>> specifying complex columns, as well as
> >> describe
> >>>>> the
> >>>>>>>>>> savepoint
> >>>>>>>>>>>>>> catalog
> >>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> get the metadata. That's a good solution in
> >> my
> >>>>> mind.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>> Zakelly
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Wed, Mar 12, 2025 at 10:35 AM Shengkai Fang <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Hi Gabor,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with `savepoint-metadata`
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I would argue against introducing a new connector type named
> >>>>>>>>>>>>>>>>>>> savepoint-metadata, as the existing Catalog mechanism can inherently
> >>>>>>>>>>>>>>>>>>> provide the necessary connector factory capabilities. I’ve detailed this
> >>>>>>>>>>>>>>>>>>> proposal in branch [1]. Please take a moment to review it.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> If we introduce a connector named `savepoint-metadata`, users can create a
> >>>>>>>>>>>>>>>>>>> temporary table with the connector `savepoint-metadata`, and the connector
> >>>>>>>>>>>>>>>>>>> needs to check whether the table schema is the same as the schema we
> >>>>>>>>>>>>>>>>>>> proposed in the FLIP. On the other hand, it's not easy work for others to
> >>>>>>>>>>>>>>>>>>> use a metadata table with the same schema.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>> https://github.com/apache/flink/compare/master...fsk119:flink:state-metadata?expand=1#diff-712a7bc92fe46c405fb0e61b475bb2a005cb7a72bab7df28bbb92744bcb5f465R63
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>> Shengkai
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Tue, Mar 11, 2025 at 16:56 Gabor Somogyi <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi Shengkai,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Directionally, I agree with your idea of how it can be implemented.
> >>>>>>>>>>>>>>>>>>>> I previously mentioned that TTL information is not exposed by the state
> >>>>>>>>>>>>>>>>>>>> processor API (which the SQL state connector uses to read data), and
> >>>>>>>>>>>>>>>>>>>> unless somebody shows me otherwise, this FLIP is not going to address it,
> >>>>>>>>>>>>>>>>>>>> to avoid feature creep. Our users are also interested in TTL, so sooner or
> >>>>>>>>>>>>>>>>>>>> later we're going to expose it; this is a matter of scheduling.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with `savepoint-metadata`
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I'm not sure I understand your point about StateCatalog. First of all,
> >>>>>>>>>>>>>>>>>>>> I can't agree more that StateCatalog is needed, and it is a planned
> >>>>>>>>>>>>>>>>>>>> building block in an upcoming FLIP, but I'm not sure how it can help now.
> >>>>>>>>>>>>>>>>>>>> No matter what, your knowledge is essential when we add StateCatalog. Let
> >>>>>>>>>>>>>>>>>>>> me lay out my understanding of this area:
> >>>>>>>>>>>>>>>>>>>> * First, we need create table statements to access state data and metadata
> >>>>>>>>>>>>>>>>>>>> * Once we have that, we can add StateCatalog, which could make users'
> >>>>>>>>>>>>>>>>>>>>   lives easier by, for example, providing off-the-shelf tables without
> >>>>>>>>>>>>>>>>>>>>   the need to write create table statements
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> User expectations:
> >>>>>>>>>>>>>>>>>>>> * See state data (this is fulfilled with the existing connector)
> >>>>>>>>>>>>>>>>>>>> * See metadata about state data like TTL (this can be added as a metadata
> >>>>>>>>>>>>>>>>>>>>   column as you suggested, since it belongs to the data)
> >>>>>>>>>>>>>>>>>>>> * See metadata about operators (this can be added from savepoint-metadata)
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> It is important to highlight that the state data table format differs
> >>>>>>>>>>>>>>>>>>>> from the state metadata table format. Namely, one table has rows for state
> >>>>>>>>>>>>>>>>>>>> values and the other has rows for operators, right?
> >>>>>>>>>>>>>>>>>>>> I think that's why you pointed out that the suggested metadata columns
> >>>>>>>>>>>>>>>>>>>> are somewhat clunky.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> In conclusion, I agree with adding a ${state-name}_ttl metadata column
> >>>>>>>>>>>>>>>>>>>> later on, since it belongs to the state value, and with adding a new table
> >>>>>>>>>>>>>>>>>>>> type for metadata (as you suggested, similar to PG [1]). Please see how
> >>>>>>>>>>>>>>>>>>>> Spark does that too [2].
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> If you have a better approach, then please elaborate with more details
> >>>>>>>>>>>>>>>>>>>> and help me understand your point.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in TB savepoints that the number of keys
> >>>>>>>>>>>>>>>>>>>>> can be extremely huge but not the per key state itself.
> >>>>>>>>>>>>>>>>>>>>> But again, this is a good feature as-is and can be handled in a
> >>>>>>>>>>>>>>>>>>>>> separate jira.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I've just created https://issues.apache.org/jira/browse/FLINK-37456.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> >>>>>>>>>>>>>>>>>>>> [2] https://www.databricks.com/blog/announcing-state-reader-api-new-statestore-data-source
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> BR,
> >>>>>>>>>>>>>>>>>>>> G
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Tue, Mar 11, 2025 at 3:55 AM Shengkai Fang <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Hi, Gabor. Thanks for your response.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Thank you for addressing the limitations here. However, I believe it
> >>>>>>>>>>>>>>>>>>>>> would be beneficial to further clarify the API in this FLIP regarding
> >>>>>>>>>>>>>>>>>>>>> how users can specify the TTL column.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> One potential approach that comes to mind is using a standardized
> >>>>>>>>>>>>>>>>>>>>> naming convention such as ${state-name}_ttl for the metadata column that
> >>>>>>>>>>>>>>>>>>>>> defines the TTL value. In terms of implementation, the
> >>>>>>>>>>>>>>>>>>>>> listReadableMetadata function could:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> 1. Read the table’s columns and configuration,
> >>>>>>>>>>>>>>>>>>>>> 2. Extract all defined state names, and
> >>>>>>>>>>>>>>>>>>>>> 3. Return a structured list of metadata entries formatted as
> >>>>>>>>>>>>>>>>>>>>>    ${state-name}_ttl.
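The three steps above can be pictured with a minimal sketch. It is plain Java, not the Flink connector API: the class `TtlMetadataKeys`, the method `ttlMetadataKeys`, the sample state names, and the `BIGINT` type string are all hypothetical names chosen for illustration. In Flink itself, `listReadableMetadata` returns a map from metadata key to `DataType` rather than to a string.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TtlMetadataKeys {
    // Hypothetical helper: derives one "<state-name>_ttl" metadata key per
    // state name, keeping the order in which the state names were given.
    public static Map<String, String> ttlMetadataKeys(List<String> stateNames) {
        Map<String, String> keys = new LinkedHashMap<>();
        for (String name : stateNames) {
            // The value type is an assumption; the thread does not fix it.
            keys.put(name + "_ttl", "BIGINT");
        }
        return keys;
    }

    public static void main(String[] args) {
        // Sample state names are made up for the example.
        System.out.println(ttlMetadataKeys(List.of("MyValueState", "MyMapState")));
    }
}
```

The sketch only covers the naming convention; how the state names are extracted from the table's columns and configuration is left open here, as it is in the proposal.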
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> 2. Adding a new connector with `savepoint-metadata`
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Introducing a new connector type at this stage may unnecessarily
> >>>>>>>>>>>>>>>>>>>>> complicate the system. Given that every table already belongs to a
> >>>>>>>>>>>>>>>>>>>>> Catalog, which is designed to provide a Factory for building source or
> >>>>>>>>>>>>>>>>>>>>> sink connectors, I propose integrating a dedicated StateCatalog instead.
> >>>>>>>>>>>>>>>>>>>>> This approach would allow us to:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> 1. Leverage the Catalog’s existing capabilities to manage TTL metadata
> >>>>>>>>>>>>>>>>>>>>>    (e.g., state names and TTL logic) without duplicating functionality.
> >>>>>>>>>>>>>>>>>>>>> 2. Provide a unified interface for connector instantiation and metadata
> >>>>>>>>>>>>>>>>>>>>>    handling through the Catalog’s Factory pattern.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Would this design decision better align with our architecture’s
> >>>>>>>>>>>>>>>>>>>>> extensibility and reduce redundancy?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in TB savepoints that the number of keys
> >>>>>>>>>>>>>>>>>>>>>> can be extremely huge but not the per key state itself.
> >>>>>>>>>>>>>>>>>>>>>> But again, this is a good feature as-is and can be handled in a
> >>>>>>>>>>>>>>>>>>>>>> separate jira.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> +1 for a separate jira.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>> Shengkai
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Mon, Mar 10, 2025 at 19:05 Gabor Somogyi <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Hi Shengkai,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Please see my comments inline.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> BR,
> >>>>>>>>>>>>>>>>>>>>>> G
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 3, 2025 at 7:07 AM Shengkai Fang <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Hi, Gabor. Thanks for the FLIP. I have some questions about the FLIP:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 1. State TTL for Value Columns
> >>>>>>>>>>>>>>>>>>>>>>> How can users retrieve the state TTL (Time-to-Live) for each value
> >>>>>>>>>>>>>>>>>>>>>>> column? From my understanding of the current design, it seems that this
> >>>>>>>>>>>>>>>>>>>>>>> functionality is not supported. Could you clarify if there are plans to
> >>>>>>>>>>>>>>>>>>>>>>> address this limitation?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Since the state processor API does not yet expose this information,
> >>>>>>>>>>>>>>>>>>>>>> this would require several steps.
> >>>>>>>>>>>>>>>>>>>>>> First, support needs to be added to the state processor API, which can
> >>>>>>>>>>>>>>>>>>>>>> then be exposed through the SQL API.
> >>>>>>>>>>>>>>>>>>>>>> This is definitely a useful future improvement and can be handled in a
> >>>>>>>>>>>>>>>>>>>>>> separate jira.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 2. Metadata Table vs. Metadata Column
> >>>>>>>>>>>>>>>>>>>>>>> The metadata information described in the FLIP appears to be intended
> >>>>>>>>>>>>>>>>>>>>>>> to describe the state files stored at a specific location. To me, this
> >>>>>>>>>>>>>>>>>>>>>>> concept aligns more closely with system tables like pg_tables in
> >>>>>>>>>>>>>>>>>>>>>>> PostgreSQL [1] or the INFORMATION_SCHEMA in MySQL [2].
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Adding a new connector named `savepoint-metadata` is one way to provide
> >>>>>>>>>>>>>>>>>>>>>> such functionality.
> >>>>>>>>>>>>>>>>>>>>>> I'm not against that; I just want us to agree that we would like to
> >>>>>>>>>>>>>>>>>>>>>> move in that direction.
> >>>>>>>>>>>>>>>>>>>>>> (As a side note, not just PG but also Spark has a similar approach, and
> >>>>>>>>>>>>>>>>>>>>>> I basically like the idea.)
> >>>>>>>>>>>>>>>>>>>>>> If we go in that direction, savepoint metadata could be exposed such
> >>>>>>>>>>>>>>>>>>>>>> that one row represents an operator with its values, something like this:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> ┌─────────────────────────┬─────────────────────────────┬────────────────────────────────┬───────────┬──────────────┬──────────────────┬───────────────────────────┬──────────────────────┐
> >>>>>>>>>>>>>>>>>>>>>> │operatorName             │operatorUid                  │operatorHash                    │parallelism│maxParallelism│subtaskStatesCount│coordinatorStateSizeInBytes│totalStatesSizeInBytes│
> >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> >>>>>>>>>>>>>>>>>>>>>> │Source: datagen-source   │datagen-source-uid           │47aee94394d6ea26e2d544bef0a37bb5│2          │128           │2                 │16                         │546                   │
> >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> >>>>>>>>>>>>>>>>>>>>>> │long-udf-with-master-hook│long-udf-with-master-hook-uid│6ed3f40bff3c8dfcdfcb95128a1018f1│2          │128           │2                 │0                          │0                     │
> >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> >>>>>>>>>>>>>>>>>>>>>> │value-process            │value-process-uid            │ca4f5fe9a637b656f09ea78b3e7a15b9│2          │128           │2                 │0                          │40726                 │
> >>>>>>>>>>>>>>>>>>>>>> ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> This table can then be joined with tables created by the existing
> >>>>>>>>>>>>>>>>>>>>>> `savepoint` connector based on the UID hash (which is unique and always
> >>>>>>>>>>>>>>>>>>>>>> exists).
> >>>>>>>>>>>>>>>>>>>>>> This would mean that the existing table would need only a single
> >>>>>>>>>>>>>>>>>>>>>> metadata column, the UID hash.
> >>>>>>>>>>>>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>>>>>>>>>>> @zakelly, please share your thoughts too.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> If we opt to use metadata columns, every record in the table would end
> >>>>>>>>>>>>>>>>>>>>>>> up having identical values for these columns (please correct me if I’m
> >>>>>>>>>>>>>>>>>>>>>>> mistaken). On the other hand, the state connector requires users to
> >>>>>>>>>>>>>>>>>>>>>>> specify an operator UID or operator UID hash, after which it outputs
> >>>>>>>>>>>>>>>>>>>>>>> user-defined values in its records. This approach feels somewhat
> >>>>>>>>>>>>>>>>>>>>>>> redundant to me.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> If we add a new `savepoint-metadata` connector, then this can be
> >>>>>>>>>>>>>>>>>>>>>> addressed.
> >>>>>>>>>>>>>>>>>>>>>> On the other hand, UID and UID hash have an either-or relationship from
> >>>>>>>>>>>>>>>>>>>>>> the config perspective, so a user who provides the UID may still be
> >>>>>>>>>>>>>>>>>>>>>> interested in the hash for further calculations (the whole Flink
> >>>>>>>>>>>>>>>>>>>>>> internals depend on the hash). Printing out the human-readable UID is an
> >>>>>>>>>>>>>>>>>>>>>> explicit requirement from the user side because hashes are not
> >>>>>>>>>>>>>>>>>>>>>> human-readable.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 3. Handling LIST and MAP States in the State Connector
> >>>>>>>>>>>>>>>>>>>>>>> I have concerns about how the current design handles LIST and MAP
> >>>>>>>>>>>>>>>>>>>>>>> states. Specifically, the state connector uses Flink SQL’s MAP and ARRAY
> >>>>>>>>>>>>>>>>>>>>>>> types, which implies that it attempts to load entire MAP or LIST states
> >>>>>>>>>>>>>>>>>>>>>>> into memory.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> However, in many real-world scenarios, these states can grow very
> >>>>>>>>>>>>>>>>>>>>>>> large. Typically, the state API addresses this by providing an iterator
> >>>>>>>>>>>>>>>>>>>>>>> to traverse elements within the state incrementally. I’m unsure whether
> >>>>>>>>>>>>>>>>>>>>>>> I’ve missed something in FLIP-496 or FLIP-512, but it seems that the
> >>>>>>>>>>>>>>>>>>>>>>> current design might struggle with scalability in such cases.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> You are right, the current implementation keeps the state for a single
> >>>>>>>>>>>>>>>>>>>>>> key in memory.
> >>>>>>>>>>>>>>>>>>>>>> We considered this potential issue earlier and concluded that
> >>>>>>>>>>>>>>>>>>>>>> addressing it is not necessarily needed for the initial version and can
> >>>>>>>>>>>>>>>>>>>>>> be done as a later improvement.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Up until now we've seen even in TB savepoints that the number of keys
> >>>>>>>>>>>>>>>>>>>>>> can be extremely huge but not the per key state itself.
> >>>>>>>>>>>>>>>>>>>>>> But again, this is a good feature as-is and can be handled in a
> >>>>>>>>>>>>>>>>>>>>>> separate jira.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>> Shengkai
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> >>>>>>>>>>>>>>>>>>>>>>> [2] https://dev.mysql.com/doc/refman/8.4/en/information-schema-tables-table.html
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 3, 2025 at 02:00 Gabor Somogyi <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Hi Zakelly,
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> To keep things simple, the target is `METADATA VIRTUAL` as the
> >>>>>>>>>>>>>>>>>>>>>>>> keywords for definition.
> >>>>>>>>>>>>>>>>>>>>>>>> If it's not too complex, the latter can be added too.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> BR,
> >>>>>>>>>>>>>>>>>>>>>>>> G
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Mar 2, 2025 at 3:37 PM Zakelly Lan <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Hi Gabor,
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> +1 for this.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Will the metadata column use `METADATA VIRTUAL` as keywords for
> >>>>>>>>>>>>>>>>>>>>>>>>> definition, or `METADATA FROM xxx VIRTUAL` for renaming, just like the
> >>>>>>>>>>>>>>>>>>>>>>>>> Kafka table?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>> Zakelly
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Mar 1, 2025 at 1:31 PM Gabor Somogyi <[email protected]> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi All,
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> I'd like to start a discussion of FLIP-512: Add meta information to
> >>>>>>>>>>>>>>>>>>>>>>>>>> SQL state connector [1].
> >>>>>>>>>>>>>>>>>>>>>>>>>> Feel free to add your thoughts to make this feature better.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-512%3A+Add+meta+information+to+SQL+state+connector
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> BR,
> >>>>>>>>>>>>>>>>>>>>>>>>>> G
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
>
>
