Re: [DISCUSS] FLIP-512: Add meta information to SQL state connector

Zakelly Lan Tue, 25 Mar 2025 23:33:07 -0700

Hi everyone,

I'm fine with a separate SQL connector for metadata, so maybe we could
update the FLIP to reflect our discussion? Also, Shengkai provided a PTF
implementation; does that also meet the requirement?



Best,
Zakelly

On Thu, Mar 20, 2025 at 4:47 PM Gabor Somogyi <[email protected]>
wrote:

> Hi All,
>
> @Zakelly: Gyula summarised what I meant correctly, so please treat the
> content as mine.
> In addition, I'm not against adding a CLI at all; I'm just stating that in
> some cases like this, users would like to have
> a self-serving solution where they can provide SQL statements which can
> trigger alerts automatically.
>
> My personal opinion is that a CLI would be beneficial in several cases. A
> good example is when users want to restart a job
> from specific Kafka offsets which are persisted in a savepoint. In such a
> scenario users are more than happy with a CLI, since they
> expect manual intervention with full control. So all in all, one can count
> on my +1 when a CLI FLIP comes up...
>
> BR,
> G
>
>
> On Thu, Mar 20, 2025 at 8:20 AM Gyula Fóra <[email protected]> wrote:
>
>> Hi!
>>
>> @Zakelly Lan <[email protected]>
>> I think what Gabor means is that users want to have predefined SQL scripts
>> to perform state analysis tasks to debug/identify problems.
>> For example, a SQL script that joins the metadata table with the state and
>> runs some analytics on it.
>>
>> If we have a meta table then the SQL script that can do this is fixed and
>> users can trigger this on demand by simply providing a new savepoint path.
>>
>> If we have a different mechanism to extract metadata that is not SQL native
>> then manual steps need to be executed and a custom SQL script would need to
>> be written that adds the manually extracted metadata into the script.
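
To illustrate the "fixed script, new savepoint path" idea, here is a minimal Java sketch. The connector name `savepoint-metadata` is the one floated in this thread; the option key `state.path`, the table name `state_metadata` and the columns `operator_uid` and `state_size_bytes` are assumptions made up for illustration, not an agreed schema.

```java
// Hypothetical sketch: the SQL text stays fixed and only the savepoint path varies.
// Option keys, table name and column names are assumptions, not an agreed schema.
class MetadataScriptTemplate {

    private static final String TEMPLATE = String.join("\n",
            "CREATE TEMPORARY TABLE state_metadata (",
            "  operator_uid STRING,",
            "  state_size_bytes BIGINT",
            ") WITH (",
            "  'connector' = 'savepoint-metadata',",
            "  'state.path' = '%s'",
            ");",
            "",
            "SELECT operator_uid, state_size_bytes",
            "FROM state_metadata",
            "ORDER BY state_size_bytes DESC;");

    /** Returns the fixed script with the given savepoint path filled in. */
    static String render(String savepointPath) {
        // Reject quotes so the path cannot terminate the SQL string literal early.
        if (savepointPath.contains("'")) {
            throw new IllegalArgumentException("savepoint path must not contain quotes");
        }
        return String.format(TEMPLATE, savepointPath);
    }

    public static void main(String[] args) {
        System.out.println(render("s3://bucket/savepoints/savepoint-1234"));
    }
}
```

With a template like this, an automated job only has to supply the new path; nothing else in the script changes between runs.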
>>
>> Cheers,
>> Gyula
>>
>> On Thu, Mar 20, 2025 at 4:32 AM Zakelly Lan <[email protected]>
>> wrote:
>>
>> > Hi all,
>> >
>> > Thanks for your answers! Getting everyone aligned on this topic is
>> > challenging, but it’s definitely worth the effort since it will help
>> > streamline things moving forward.
>> >
>> > @Gabor are you saying that users are using some scripts to define the SQL
>> > metadata connector and get the information, right? If so, would a CLI tool
>> > be more convenient? It's easy to invoke and can get the result swiftly. And
>> > there should be some other systems to track the checkpoint lineage and
>> > analyze if there are outliers in metadata (e.g. state size of one operator),
>> > right? Well, maybe I missed something so please correct me if I'm wrong.
>> >
>> > > I think the overall vision in Flink SQL is to provide a SQL native
>> > > environment where we can serve complex use-cases like you would expect in a
>> > > regular database.
>> >
>> >
>> > @Gyula Well, this is a good point. From the perspective of a comprehensive
>> > SQL experience, I'd +1 for treating metadata as data. Although I doubt
>> > there is a need for processing metadata, I won't be against a separate
>> > connector.
>> >
>> > Regarding the CLI tool, I still think it’s worth implementing. Such a tool
>> > could provide savepoint information before resuming from a savepoint, which
>> > would enhance the user experience in CLI-based workflows. It would be good
>> > if someone could implement this feature. We shouldn’t worry about whether
>> > this tool might be retired in the future. Regardless of the SQL-based
>> > solution we eventually adopt, this capability will remain essential for CLI
>> > users. This is another topic.
>> >
>> >
>> > Best,
>> > Zakelly
>> >
>> >
>> > On Thu, Mar 20, 2025 at 10:37 AM Shengkai Fang <[email protected]>
>> wrote:
>> >
>> > > Hi.
>> > >
>> > > After reading the doc[1], I think Spark provides a function for users to
>> > > consume the metadata from the savepoint. In Flink SQL, similar
>> > > functionality is implemented through Polymorphic Table Functions (PTF) as
>> > > proposed in FLIP-440[2]. Below is a code example[3] illustrating this
>> > > concept:
>> > >
>> > > ```
>> > >     public static class ScalarArgsFunction extends TestProcessTableFunctionBase {
>> > >         public void eval(Integer i, Boolean b) {
>> > >             collectObjects(i, b);
>> > >         }
>> > >     }
>> > > ```
>> > >
>> > > ```
>> > > INSERT INTO sink SELECT * FROM f(i => 42, b => CAST('TRUE' AS BOOLEAN))
>> > > ```
>> > >
>> > > So we can add a builtin function named `read_state_metadata` to read
>> > > savepoint data.
>> > >
>> > > Best,
>> > > Shengkai
>> > >
>> > > [1] https://docs.databricks.com/aws/en/structured-streaming/read-state?language=SQL
>> > > [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093
>> > > [3] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/stream/ProcessTableFunctionTestPrograms.java#L140
>> > >
>> > > Gyula Fóra <[email protected]> wrote on Wed, Mar 19, 2025 at 18:37:
>> > >
>> > > > Hi All!
>> > > >
>> > > > Thank you for the answers and concerns from everyone.
>> > > >
>> > > > On the CLI vs State Metadata Connector/Table question I would also like to
>> > > > step back a little and look at the bigger picture.
>> > > >
>> > > > I think the overall vision in Flink SQL is to provide a SQL native
>> > > > environment where we can serve complex use-cases like you would expect in a
>> > > > regular database.
>> > > > Most features and developments in recent years have gone this way.
>> > > >
>> > > > The State Metadata Table would be a natural and straightforward fit here.
>> > > > So from my side, +1 for that.
>> > > >
>> > > > However I could understand if we are not ready to add a new
>> > > > connector/format due to maintenance concerns (and general concerns about
>> > > > the design).
>> > > > If that's the issue then we should spend more time on the design to get
>> > > > comfortable with the approach and seek feedback from the wider community.
>> > > >
>> > > > I am -1 for the CLI/tooling approach as it will not provide any of the
>> > > > feature set we are looking for beyond what is already covered by the Java
>> > > > connector. And that approach would come with the same maintenance
>> > > > implications.
>> > > >
>> > > > Cheers
>> > > > Gyula
>> > > >
>> > > >
>> > > > On Wed, Mar 19, 2025 at 11:24 AM Gabor Somogyi <[email protected]> wrote:
>> > > >
>> > > > > Hi Zakelly, Shengkai
>> > > > >
>> > > > > Several topics are in flight, so I'm adding short answers to each. If some
>> > > > > topic is not touched, please highlight it.
>> > > > >
>> > > > > @Shengkai: I've read through all the previous catalog-related FLIPs, and if
>> > > > > we would like to keep the concepts there
>> > > > > then a one-to-one mapping between savepoint and catalog is a
>> > > > > reasonable direction. In short I'm happy that
>> > > > > you've highlighted this and agree as a whole. I've written it down
>> > > > > previously, just want to double confirm that the state catalog is
>> > > > > essential and planned. When we reach this point your input is more
>> > > > > than welcome.
>> > > > >
>> > > > > @Zakelly: We've tried the CLI and separate library approaches with users
>> > > > > already and these were not welcomed, for the following reasons:
>> > > > > * Users want automated tasks, not manual parsing of CLI/library output.
>> > > > > This can be hacked around but our experience with it is negative
>> > > > > because it's just brittle.
>> > > > > * From a development perspective it's a much bigger effort than a connector
>> > > > > (hard to test, packaging/version handling is an extra layer of complexity,
>> > > > > external FS authentication is a pain for users, and it expects them to
>> > > > > download savepoints as well)
>> > > > > * Purely personal opinion, but if we find better ways later then
>> > > > > retiring a CLI is no more lightweight than retiring a connector
>> > > > >
>> > > > > > It would be great if you give some examples on how user could leverage
>> > > > > > the separate connector to process the metadata.
>> > > > >
>> > > > > The simplest cases:
>> > > > > * give me the overgrowing state uids
>> > > > > * give me the unknown (new or renamed) state uids
>> > > > > * give me the state uids where state size dropped drastically compared to a
>> > > > > previous savepoint (accidental state loss)
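
The three cases above can be sketched in plain Java as a comparison of two metadata snapshots. This is only an illustration under assumptions: each map stands in for one savepoint's metadata table (operator uid mapped to state size in bytes), and the class name, method names and thresholds are hypothetical rather than part of any proposed API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each map stands in for one savepoint's metadata table,
// keyed by operator uid with the state size in bytes as the value.
class StateMetadataDiff {

    /** Uids present in the current savepoint but absent from the previous one. */
    static List<String> unknownUids(Map<String, Long> previous, Map<String, Long> current) {
        List<String> result = new ArrayList<>();
        for (String uid : current.keySet()) {
            if (!previous.containsKey(uid)) {
                result.add(uid);
            }
        }
        return result;
    }

    /** Uids whose state size is now more than previousSize * factor. */
    static List<String> overgrowingUids(Map<String, Long> previous, Map<String, Long> current, double factor) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            Long before = previous.get(e.getKey());
            if (before != null && e.getValue() > before * factor) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Uids whose state size is now less than previousSize * ratio (possible state loss). */
    static List<String> droppedUids(Map<String, Long> previous, Map<String, Long> current, double ratio) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : current.entrySet()) {
            Long before = previous.get(e.getKey());
            if (before != null && e.getValue() < before * ratio) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Long> previous = new LinkedHashMap<>();
        previous.put("source-uid", 1_000L);
        previous.put("window-uid", 50_000L);
        previous.put("join-uid", 200_000L);

        Map<String, Long> current = new LinkedHashMap<>();
        current.put("source-uid", 1_100L);
        current.put("window-uid", 400_000L);
        current.put("join-uid", 1_000L);
        current.put("dedup-uid", 3_000L);

        System.out.println("unknown:     " + unknownUids(previous, current));
        System.out.println("overgrowing: " + overgrowingUids(previous, current, 2.0));
        System.out.println("dropped:     " + droppedUids(previous, current, 0.1));
    }
}
```

With a metadata table, each of these checks becomes a join between two tables instead of hand-written parsing code.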
>> > > > >
>> > > > > Since it was mentioned: as a general off-topic teaser, yes, it would be good
>> > > > > to have some sort of checkpoint/savepoint lineage, or whatever we call it.
>> > > > > Since we've not yet reached this point there are no technical details;
>> > > > > it's more like a vision. It's a common pattern that
>> > > > > jobs are physically running but somehow the state processing is stuck, and
>> > > > > it would be good to add some way to find that out automatically.
>> > > > > The key word here is automation rather than manual evaluation, since
>> > > > > handling 10k+ jobs simply does not allow the latter.
>> > > > >
>> > > > > BR,
>> > > > > G
>> > > > >
>> > > > >
>> > > > > On Wed, Mar 19, 2025 at 6:46 AM Shengkai Fang <[email protected]>
>> > > wrote:
>> > > > >
>> > > > > > Hi, All.
>> > > > > >
>> > > > > > About State Catalog, I want to share more thoughts about this.
>> > > > > >
>> > > > > > In the initial design concept, I understood that a savepoint and a state
>> > > > > > catalog have a one-to-one mapping relationship. Each operator corresponds
>> > > > > > to a database, and the state of each operator is represented as individual
>> > > > > > tables. The rationale behind this design is:
>> > > > > >
>> > > > > > *State Diversity*: An operator may involve multiple types of states. For
>> > > > > > example, in our VVR design, a "multi-join" operator uses keyed states for
>> > > > > > two input streams and a broadcast state for the third stream. This makes it
>> > > > > > challenging to represent all states of an operator within a single table.
>> > > > > > *Scalability*: Internally, an operator might have multiple keyed states
>> > > > > > (e.g., value state and list state). However, large list states may not fit
>> > > > > > entirely in memory. To address this, we recommend implementing each state
>> > > > > > as a separate table.
>> > > > > >
>> > > > > > To resolve the loosely coupled relationships between operator states, we
>> > > > > > propose embedding predefined views within the catalog. These views simplify
>> > > > > > user understanding of operator implementations and provide a more intuitive
>> > > > > > perspective. For instance, a join operator may have multiple state
>> > > > > > implementations (depending on whether the join key includes unique
>> > > > > > attributes), but users primarily care about the data associated with a
>> > > > > > specific join key across input streams.
>> > > > > >
>> > > > > > Returning to the one-to-one mapping between savepoints and catalogs, we aim
>> > > > > > to manage multiple user state catalogs through a catalog store. When a user
>> > > > > > triggers a savepoint for a job on the platform:
>> > > > > >
>> > > > > > 1. The platform sends a REST request to the JobManager.
>> > > > > > 2. Simultaneously, it registers a new state catalog in the catalog store,
>> > > > > > enabling immediate analysis of state data on the platform.
>> > > > > > 3. Deleting a savepoint would also trigger the removal of its associated
>> > > > > > catalog.
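
The register-on-trigger and remove-on-delete lifecycle described above could look roughly like the following Java sketch. It is an in-memory stand-in written for illustration only: `SavepointCatalogRegistry` and its methods are hypothetical names and do not correspond to Flink's catalog store API. The catalog naming scheme `savepoint-<id>` follows the `savepoint-${id}` form used in this message.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the one-to-one savepoint/catalog bookkeeping;
// it does not use or mirror Flink's real catalog store API.
class SavepointCatalogRegistry {

    // catalog name -> savepoint path; exactly one catalog per savepoint
    private final Map<String, String> catalogs = new ConcurrentHashMap<>();

    static String catalogName(String savepointId) {
        return "savepoint-" + savepointId;
    }

    /** Called when the platform triggers a savepoint; registers its catalog. */
    String onSavepointTriggered(String savepointId, String savepointPath) {
        String name = catalogName(savepointId);
        if (catalogs.putIfAbsent(name, savepointPath) != null) {
            throw new IllegalStateException("catalog already registered: " + name);
        }
        return name;
    }

    /** Called when a savepoint is deleted; returns true if a catalog was removed. */
    boolean onSavepointDeleted(String savepointId) {
        return catalogs.remove(catalogName(savepointId)) != null;
    }

    /** Looks up the savepoint path behind a catalog, if that catalog is registered. */
    Optional<String> pathOf(String name) {
        return Optional.ofNullable(catalogs.get(name));
    }

    public static void main(String[] args) {
        SavepointCatalogRegistry registry = new SavepointCatalogRegistry();
        String name = registry.onSavepointTriggered("42", "s3://bucket/savepoints/savepoint-42");
        System.out.println(name + " -> " + registry.pathOf(name).orElse("<missing>"));
        System.out.println("removed: " + registry.onSavepointDeleted("42"));
    }
}
```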
>> > > > > >
>> > > > > > This vision assumes that states are self-describing or that a
>> state
>> > > > > > metaservice is introduced to analyze savepoint structures.
>> > > > > >
>> > > > > > > How can users create logic to identify differences between multiple
>> > > > > > > savepoints?
>> > > > > >
>> > > > > > Since savepoints and state catalogs are one-to-one mapped, users can query
>> > > > > > metadata via their respective catalogs. For example:
>> > > > > >
>> > > > > > 1. `savepoint-${id}`.`system`.`metadata_table`.`<operator-name>` provides
>> > > > > > operator-specific metadata (e.g., state size, type).
>> > > > > > 2. Comparing metadata tables (e.g., schema versions, state entry counts)
>> > > > > > across catalogs reveals structural or quantitative differences.
>> > > > > > 3. For deeper analysis, users could write SQL queries to compare specific
>> > > > > > state partitions or leverage the metaservice to track state evolution
>> > > > > > (e.g., added/removed operators, modified state configurations).
>> > > > > >
>> > > > > > If we plan to introduce a state catalog in the future, I would lean toward
>> > > > > > using metadata tables. If a utility tool can address the challenges we
>> > > > > > face, could we avoid introducing an additional connector?
>> > > > > >
>> > > > > > Best,
>> > > > > > Shengkai
>> > > > > >
>> > > > > > Gyula Fóra <[email protected]> wrote on Mon, Mar 17, 2025 at 20:25:
>> > > > > >
>> > > > > > > Hi All!
>> > > > > > >
>> > > > > > > Without going into too much detail here are my 2 cents regarding the
>> > > > > > > virtual column / catalog metadata / table (connector) discussion for the
>> > > > > > > State metadata.
>> > > > > > >
>> > > > > > > State metadata such as the types of states, their properties, names, sizes
>> > > > > > > etc are all valuable information that can be used to enrich the
>> > > > > > > computations we do on state.
>> > > > > > > We can either analyze it standalone (such as discover anomalies, for large
>> > > > > > > jobs with many states), across multiple savepoints (discover how state
>> > > > > > > changed over time) or by joining it with keyed or non-keyed state data to
>> > > > > > > serve more complex queries on the state.
>> > > > > > >
>> > > > > > > The only solution that seems to serve all these use-cases and requirements
>> > > > > > > in a straightforward and SQL canonical way is to simply expose the state
>> > > > > > > metadata as a separate table. This is a metadata table but you can also
>> > > > > > > think of it as data table, it makes no practical difference here.
>> > > > > > >
>> > > > > > > Once we have a catalog later, the catalog can offer this table out of the
>> > > > > > > box, the same way databases provide metadata tables. For this to work
>> > > > > > > however we need another, simpler connector that creates this table.
>> > > > > > >
>> > > > > > > +1 for state metadata as a separate connector/table, instead of adding
>> > > > > > > virtual columns and adhoc catalog metadata that is hard to use in a large
>> > > > > > > number of queries.
>> > > > > > >
>> > > > > > > Cheers,
>> > > > > > > Gyula
>> > > > > > >
>> > > > > > > On Mon, Mar 17, 2025 at 12:44 PM Gabor Somogyi <[email protected]> wrote:
>> > > > > > >
>> > > > > > > > 1. State TTL for Value Columns
>> > > > > > > >
>> > > > > > > > > I’m planning on adding this, and we may collaborate on it in the
>> > > > > > > > > future.
>> > > > > > > >
>> > > > > > > > +1 on this, just ping me.
>> > > > > > > >
>> > > > > > > > 2. Metadata Table vs. Metadata Column
>> > > > > > > >
>> > > > > > > > After some code digging and a POC, all I can say is that with heavy effort
>> > > > > > > > we can maybe make changes that let us show the metadata of a savepoint
>> > > > > > > > from the catalog.
>> > > > > > > > I'm not against that, but from the user perspective this has limited value;
>> > > > > > > > let me explain why.
>> > > > > > > >
>> > > > > > > > From a high-level perspective I see the following, which I think we agree on:
>> > > > > > > > * We should have a catalog which represents the savepoint data set of
>> > > > > > > > one or more jobs (future plan)
>> > > > > > > > * Savepoints should be registrable in the catalog, where they
>> > > > > > > > become databases (future plan)
>> > > > > > > > * There must be a possibility to create tables from databases where
>> > > > > > > > users can read state data (exists already)
>> > > > > > > >
>> > > > > > > > In terms of metadata, if I understand correctly the suggested approach
>> > > > > > > > would be to access
>> > > > > > > > it from the catalog describe command, right? Adding that info when a
>> > > > > > > > specific database describe command
>> > > > > > > > is executed could be done.
>> > > > > > > >
>> > > > > > > > The question is, for instance, how users can build logic that tells
>> > > > > > > > them what the difference is between multiple savepoints.
>> > > > > > > > Just to give some examples:
>> > > > > > > > * per operator size changes between savepoints
>> > > > > > > > * show values from operator data where state size reaches a boundary
>> > > > > > > > * in general "find which checkpoint ruined things" is quite a common
>> > > > > > > > pattern
>> > > > > > > > What I would like to highlight here is that from Flink's point of view
>> > > > > > > > the metadata can be
>> > > > > > > > considered static side-output information, but for users these values
>> > > > > > > > are actual real data
>> > > > > > > > around which logic is planned to be built.
>> > > > > > > >
>> > > > > > > > > The metadata is more like one-time information instead of a streaming
>> > > > > > > > > data that changes all
>> > > > > > > > > the time, so a single connector seems to be an overkill.
>> > > > > > > >
>> > > > > > > > State data is also static within a savepoint, and that's the reason why
>> > > > > > > > the state processor API works in batch mode.
>> > > > > > > > When we handle multiple checkpoints in a streaming fashion, this can be
>> > > > > > > > viewed from another angle.
>> > > > > > > >
>> > > > > > > > We can come up with a more lightweight solution than a new connector,
>> > > > > > > > but forcing users to parse the catalog
>> > > > > > > > describe command output in order to compare multiple savepoints doesn't
>> > > > > > > > sound like a smooth user experience.
>> > > > > > > > Honestly I have no other idea for how to expose metadata as real user
>> > > > > > > > data, so I'm waiting for other approaches.
>> > > > > > > >
>> > > > > > > > BR,
>> > > > > > > > G
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Thu, Mar 13, 2025 at 2:44 AM Shengkai Fang <[email protected]> wrote:
>> > > > > > > >
>> > > > > > > > > Looking forward to hearing the good news!
>> > > > > > > > >
>> > > > > > > > > Best,
>> > > > > > > > > Shengkai
>> > > > > > > > >
>> > > > > > > > > Gabor Somogyi <[email protected]> wrote on Wed, Mar 12, 2025 at 22:24:
>> > > > > > > > >
>> > > > > > > > > > Thanks for both the valuable input!
>> > > > > > > > > >
>> > > > > > > > > > Let me take a closer look at the suggestions, like the Catalog
>> > > > > > > > > > capabilities and the possibility of embedding TypeInformation or
>> > > > > > > > > > StateDescriptor metadata directly into the raw state files...
>> > > > > > > > > >
>> > > > > > > > > > BR,
>> > > > > > > > > > G
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > On Wed, Mar 12, 2025 at 8:17 AM Shengkai Fang <[email protected]> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Thanks for Zakelly's clarification.
>> > > > > > > > > > >
>> > > > > > > > > > > 1. State TTL for Value Columns
>> > > > > > > > > > >
>> > > > > > > > > > > +1 to delay the discussion about this.
>> > > > > > > > > > >
>> > > > > > > > > > > 2. Metadata Table vs. Metadata Column
>> > > > > > > > > > >
>> > > > > > > > > > > I’d like to share my perspective on the State Catalog proposal. While
>> > > > > > > > > > > introducing this capability is beneficial, there is a blocker: the current
>> > > > > > > > > > > StateBackend architecture does not permit operators to encode
>> > > > > > > > > > > TypeInformation into the state; it only preserves the Serializer. This
>> > > > > > > > > > > limitation creates an asymmetry, as operators alone retain knowledge of the
>> > > > > > > > > > > data structure’s schema.
>> > > > > > > > > > >
>> > > > > > > > > > > To address this, I suggest allowing operators to embed TypeInformation or
>> > > > > > > > > > > StateDescriptor metadata directly into the raw state files. Such a design
>> > > > > > > > > > > would enable the Catalog to:
>> > > > > > > > > > >
>> > > > > > > > > > > 1. Parse state files and programmatically derive the schema and structural
>> > > > > > > > > > > guarantees for each state.
>> > > > > > > > > > > 2. Leverage existing Flink Table utilities, such as
>> > > > > > > > > > > LegacyTypeInfoDataTypeConverter (in org.apache.flink.table.types.utils), to
>> > > > > > > > > > > bridge TypeInformation and DataType conversions.
>> > > > > > > > > > >
>> > > > > > > > > > > If we cannot store the TypeInformation or StateDescriptor in the raw
>> > > > > > > > > > > state files, I am +1 for this FLIP to use a metadata column to retrieve
>> > > > > > > > > > > the information.
>> > > > > > > > > > >
>> > > > > > > > > > > Best,
>> > > > > > > > > > > Shengkai
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > Zakelly Lan <[email protected]> wrote on Wed, Mar 12, 2025 at 12:43:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi Gabor and Shengkai,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks for sharing your thoughts! This is a long discussion and sorry for
>> > > > > > > > > > > > the late reply (I'm busy catching up with release 2.0 these days).
>> > > > > > > > > > > >
>> > > > > > > > > > > > 1. State TTL for Value Columns
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Let me first clarify your thoughts to ensure I understand correctly. IIUC,
>> > > > > > > > > > > > there is no persistent configuration for state TTL in the checkpoint. While
>> > > > > > > > > > > > you can infer that TTL is enabled by reading the serializer, the checkpoint
>> > > > > > > > > > > > itself only stores the last access time for each value. So the only thing
>> > > > > > > > > > > > we can show is the last access time for each value. But not all state
>> > > > > > > > > > > > backends are required to store this, as they may directly store the
>> > > > > > > > > > > > expiration time. This will also increase the difficulty of implementation &
>> > > > > > > > > > > > maintenance.
>> > > > > > > > > > > >
>> > > > > > > > > > > > This once again reiterates the importance of unified metadata for
>> > > > > > > > > > > > checkpoints. I’m planning on adding this, and we may collaborate on it in
>> > > > > > > > > > > > the future.
>> > > > > > > > > > > >
>> > > > > > > > > > > > 2. Metadata Table vs. Metadata Column
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > I'm not in favor of adding a new connector for metadata. The metadata is
>> > > > > > > > > > > > more like one-time information than streaming data that changes all
>> > > > > > > > > > > > the time, so a separate connector seems to be overkill. It is not easy to
>> > > > > > > > > > > > withdraw a connector if we have a better solution in future. I'm not
>> > > > > > > > > > > > familiar with current Catalog capabilities, and if it could extract and
>> > > > > > > > > > > > show some operator-level information from a savepoint, that would be great.
>> > > > > > > > > > > >
>> > > > > > > > > > > > If the Catalog can't do that, I would consider the current FLIP to be a
>> > > > > > > > > > > > compromise solution.
>> > > > > > > > > > > >
>> > > > > > > > > > > > And if we have that unified metadata for checkpoint/savepoint in future, we
>> > > > > > > > > > > > may directly register a savepoint in the catalog, and create a source without
>> > > > > > > > > > > > specifying complex columns, as well as describe the savepoint catalog to
>> > > > > > > > > > > > get the metadata. That's a good solution in my mind.
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Best,
>> > > > > > > > > > > > Zakelly
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Wed, Mar 12, 2025 at 10:35 AM Shengkai Fang <[email protected]> wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Hi Gabor,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > 2. Adding a new connector with `savepoint-metadata`
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > I would argue against introducing a new connector type named
>> > > > > > > > > > > > > savepoint-metadata, as the existing Catalog mechanism can inherently
>> > > > > > > > > > > > > provide the necessary connector factory capabilities. I’ve detailed this
>> > > > > > > > > > > > > proposal in branch[1]. Please take a moment to review it.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > If we introduce a connector named `savepoint-metadata`, it means users can
>> > > > > > > > > > > > > create a temporary table with connector `savepoint-metadata`, and the
>> > > > > > > > > > > > > connector needs to check whether the table schema is the same as the schema
>> > > > > > > > > > > > > we proposed in the FLIP. On the other hand, it's not easy for others
>> > > > > > > > > > > > > to use a metadata table with the same schema.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > [1] https://github.com/apache/flink/compare/master...fsk119:flink:state-metadata?expand=1#diff-712a7bc92fe46c405fb0e61b475bb2a005cb7a72bab7df28bbb92744bcb5f465R63
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Best,
>> > > > > > > > > > > > > Shengkai
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Gabor Somogyi <[email protected]> wrote on Tue, Mar 11, 2025 at 16:56:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > Hi Shengkai,
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 1. State TTL for Value Columns
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > From a directional perspective I agree with your idea of how it can be
>> > > > > > > > > > > > > > implemented.
>> > > > > > > > > > > > > > Previously I've mentioned that TTL information is not exposed on the state
>> > > > > > > > > > > > > > processor API (which the SQL state connector uses to read data),
>> > > > > > > > > > > > > > and unless somebody shows me otherwise this FLIP is not going to address
>> > > > > > > > > > > > > > this, to avoid feature creep. Our users are also interested in TTL, so
>> > > > > > > > > > > > > > sooner or later we're going to expose it; this is a matter of scheduling.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 2. Adding a new connector with `savepoint-metadata`
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > I'm not sure I understand your point about StateCatalog. First of all,
>> > > > > > > > > > > > > > I can't agree more that StateCatalog is needed and is a planned building
>> > > > > > > > > > > > > > block in an upcoming
>> > > > > > > > > > > > > > FLIP, but I'm not sure how it can help now. No matter what, your knowledge
>> > > > > > > > > > > > > > is essential when we add StateCatalog. Let me lay out my understanding in
>> > > > > > > > > > > > > > this area:
>> > > > > > > > > > > > > > * First we need create table statements to access state data and metadata
>> > > > > > > > > > > > > > * When we have that, we can add StateCatalog, which could potentially
>> > > > > > > > > > > > > > ease the life of users by, for example, giving off-the-shelf tables without
>> > > > > > > > > > > > > > sweating over create table statements
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > User expectations:
>> > > > > > > > > > > > > > * See state data (this is fulfilled with the existing connector)
>> > > > > > > > > > > > > > * See metadata about state data like TTL (this can be added as metadata
>> > > > > > > > > > > > > > column as you suggested since it belongs to the data)
>> > > > > > > > > > > > > > * See metadata about operators (this can be added from savepoint-metadata)
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > It's important to highlight that the state data table format differs from
>> > > > > > > > > > > > > > the state metadata table format. Namely, one table has rows for state
>> > > > > > > > > > > > > > values and the other has rows for operators, right?
>> > > > > > > > > > > > > > I think that's the reason why you've pointed out that the suggested
>> > > > > > > > > > > > > > metadata columns are somewhat clunky.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > In conclusion, I agree to add the ${state-name}_ttl metadata column later
>> > > > > > > > > > > > > > on, since it belongs to the state value, and to add a new table type for
>> > > > > > > > > > > > > > metadata (as you suggested, similar to PG [1]). Please see how Spark does
>> > > > > > > > > > > > > > that too [2].
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > If you have a better approach, please elaborate in more detail and help me
>> > > > > > > > > > > > > > understand your point.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Up until now we've seen even in TB savepoints that the number of keys can
>> > > > > > > > > > > > > > > be extremely huge but not the per key state itself.
>> > > > > > > > > > > > > > > But again, this is a good feature as-is and can be handled in a separate
>> > > > > > > > > > > > > > > jira.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > I've just created https://issues.apache.org/jira/browse/FLINK-37456.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > [1] https://www.postgresql.org/docs/current/view-pg-tables.html
>> > > > > > > > > > > > > > [2] https://www.databricks.com/blog/announcing-state-reader-api-new-statestore-data-source
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > BR,
>> > > > > > > > > > > > > > G
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > On Tue, Mar 11, 2025 at 3:55 AM Shengkai Fang <[email protected]> wrote:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Hi, Gabor. Thanks for your response.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > 1. State TTL for Value Columns
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Thank you for addressing the limitations here. However, I believe it would
>> > > > > > > > > > > > > > > be beneficial to further clarify the API in this FLIP regarding how users
>> > > > > > > > > > > > > > > can specify the TTL column.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > One potential approach that comes to mind is using a standardized naming
>> > > > > > > > > > > > > > > convention such as ${state-name}_ttl for the metadata column that defines
>> > > > > > > > > > > > > > > the TTL value. In terms of implementation, the listReadableMetadata
>> > > > > > > > > > > > > > > function could:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 1. Read the table’s columns and configuration,
>> > > > > > > > > > > > > > > 2. Extract all defined state names, and
>> > > > > > > > > > > > > > > 3. Return a structured list of metadata entries formatted as
>> > > > > > > > > > > > > > > ${state-name}_ttl.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > WDYT?
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > 2. Adding a new connector with `savepoint-metadata`
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Introducing a new connector type at this stage may unnecessarily complicate
>> > > > > > > > > > > > > > > the system. Given that every table already belongs to a Catalog, which is
>> > > > > > > > > > > > > > > designed to provide a Factory for building source or sink connectors, I
>> > > > > > > > > > > > > > > propose integrating a dedicated StateCatalog instead. This approach would
>> > > > > > > > > > > > > > > allow us to:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 1. Leverage the Catalog’s existing capabilities to manage TTL metadata
>> > > > > > > > > > > > > > > (e.g., state names and TTL logic) without duplicating functionality.
>> > > > > > > > > > > > > > > 2. Provide a unified interface for connector instantiation and metadata
>> > > > > > > > > > > > > > > handling through the Catalog’s Factory pattern.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Would this design decision better align with our architecture’s
>> > > > > > > > > > > > > > > extensibility and reduce redundancy?
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Up until now we've seen even in TB savepoints that the number of keys can
>> > > > > > > > > > > > > > > > be extremely huge but not the per key state itself.
>> > > > > > > > > > > > > > > > But again, this is a good feature as-is and can be handled in a separate
>> > > > > > > > > > > > > > > > jira.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > +1 for a separate jira.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Best,
>> > > > > > > > > > > > > > > Shengkai
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > On Mon, Mar 10, 2025 at 19:05 Gabor Somogyi <[email protected]> wrote:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Hi Shengkai,
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Please see my comments inline.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > BR,
>> > > > > > > > > > > > > > > > G
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > On Mon, Mar 3, 2025 at 7:07 AM Shengkai Fang <[email protected]> wrote:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Hi, Gabor. Thanks for the FLIP. I have some questions about the FLIP:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 1. State TTL for Value Columns
>> > > > > > > > > > > > > > > > > How can users retrieve the state TTL (Time-to-Live) for each value column?
>> > > > > > > > > > > > > > > > > From my understanding of the current design, it seems that this
>> > > > > > > > > > > > > > > > > functionality is not supported. Could you clarify if there are plans to
>> > > > > > > > > > > > > > > > > address this limitation?
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Since the state processor API does not yet expose this information, this
>> > > > > > > > > > > > > > > > would require several steps.
>> > > > > > > > > > > > > > > > First, support needs to be added to the state processor API, which can then
>> > > > > > > > > > > > > > > > be exposed on the SQL API.
>> > > > > > > > > > > > > > > > This is definitely a useful future improvement and can be handled in a
>> > > > > > > > > > > > > > > > separate jira.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 2. Metadata Table vs. Metadata Column
>> > > > > > > > > > > > > > > > > The metadata information described in the FLIP appears to be intended to
>> > > > > > > > > > > > > > > > > describe the state files stored at a specific location. To me, this concept
>> > > > > > > > > > > > > > > > > aligns more closely with system tables like pg_tables in PostgreSQL [1] or
>> > > > > > > > > > > > > > > > > the INFORMATION_SCHEMA in MySQL [2].
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Adding a new connector with `savepoint-metadata` is a possibility where we
>> > > > > > > > > > > > > > > > can create such functionality.
>> > > > > > > > > > > > > > > > I'm not against that, I just want to have a common agreement that we would
>> > > > > > > > > > > > > > > > like to move in that direction.
>> > > > > > > > > > > > > > > > (As a side note, not just PG but Spark also has a similar approach, and I
>> > > > > > > > > > > > > > > > basically like the idea.)
>> > > > > > > > > > > > > > > > If we go in that direction, savepoint metadata can be accessed in a way that
>> > > > > > > > > > > > > > > > one row represents an operator with its values, something like this:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > ┌─────────────────────────┬─────────────────────────────┬────────────────────────────────┬───────────┬──────────────┬──────────────────┬───────────────────────────┬──────────────────────┐
>> > > > > > > > > > > > > > > > │operatorName             │operatorUid                  │operatorHash                    │parallelism│maxParallelism│subtaskStatesCount│coordinatorStateSizeInBytes│totalStatesSizeInBytes│
>> > > > > > > > > > > > > > > > ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
>> > > > > > > > > > > > > > > > │Source: datagen-source   │datagen-source-uid           │47aee94394d6ea26e2d544bef0a37bb5│2          │128           │2                 │16                         │546                   │
>> > > > > > > > > > > > > > > > ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
>> > > > > > > > > > > > > > > > │long-udf-with-master-hook│long-udf-with-master-hook-uid│6ed3f40bff3c8dfcdfcb95128a1018f1│2          │128           │2                 │0                          │0                     │
>> > > > > > > > > > > > > > > > ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
>> > > > > > > > > > > > > > > > │value-process            │value-process-uid            │ca4f5fe9a637b656f09ea78b3e7a15b9│2          │128           │2                 │0                          │40726                 │
>> > > > > > > > > > > > > > > > ├─────────────────────────┼─────────────────────────────┼────────────────────────────────┼───────────┼──────────────┼──────────────────┼───────────────────────────┼──────────────────────┤
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > This table can then be joined with the tables created by the already
>> > > > > > > > > > > > > > > > existing `savepoint` connector based on the UID hash (which is unique and
>> > > > > > > > > > > > > > > > always exists).
>> > > > > > > > > > > > > > > > This would mean that the already existing table would need only a single
>> > > > > > > > > > > > > > > > metadata column, which is the UID hash.
>> > > > > > > > > > > > > > > > WDYT?
>> > > > > > > > > > > > > > > > @zakelly, please share your thoughts too.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > If we opt to use metadata columns, every record in the table would end up
>> > > > > > > > > > > > > > > > > having identical values for these columns (please correct me if I’m
>> > > > > > > > > > > > > > > > > mistaken). On the other hand, the state connector requires users to specify
>> > > > > > > > > > > > > > > > > an operator UID or operator UID hash, after which it outputs user-defined
>> > > > > > > > > > > > > > > > > values in its records. This approach feels somewhat redundant to me.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > If we add a new `savepoint-metadata` connector then this can be addressed.
>> > > > > > > > > > > > > > > > On the other hand, UID and UID hash have an either-or relationship from
>> > > > > > > > > > > > > > > > the config perspective, so when a user provides the UID, they may still be
>> > > > > > > > > > > > > > > > interested in the hash for further calculations (all Flink internals
>> > > > > > > > > > > > > > > > depend on the hash). Printing the human-readable UID is an explicit
>> > > > > > > > > > > > > > > > requirement from the user side because hashes are not human readable.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 3. Handling LIST and MAP States in the State Connector
>> > > > > > > > > > > > > > > > > I have concerns about how the current design handles LIST and MAP states.
>> > > > > > > > > > > > > > > > > Specifically, the state connector uses Flink SQL’s MAP and ARRAY types,
>> > > > > > > > > > > > > > > > > which implies that it attempts to load entire MAP or LIST states into
>> > > > > > > > > > > > > > > > > memory.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > However, in many real-world scenarios, these states can grow very large.
>> > > > > > > > > > > > > > > > > Typically, the state API addresses this by providing an iterator to
>> > > > > > > > > > > > > > > > > traverse elements within the state incrementally. I’m unsure whether I’ve
>> > > > > > > > > > > > > > > > > missed something in FLIP-496 or FLIP-512, but it seems that the current
>> > > > > > > > > > > > > > > > > design might struggle with scalability in such cases.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > You're right, the current implementation keeps the state for a single key
>> > > > > > > > > > > > > > > > in memory.
>> > > > > > > > > > > > > > > > Back in the day we considered this potential issue and concluded that
>> > > > > > > > > > > > > > > > handling it is not necessarily needed for the initial version and can be
>> > > > > > > > > > > > > > > > done as a later improvement.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Up until now we've seen even in TB savepoints that the number of keys can
>> > > > > > > > > > > > > > > > be extremely huge but not the per key state itself.
>> > > > > > > > > > > > > > > > But again, this is a good feature as-is and can be handled in a separate
>> > > > > > > > > > > > > > > > jira.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Best,
>> > > > > > > > > > > > > > > > > Shengkai
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > [1] https://www.postgresql.org/docs/current/view-pg-tables.html
>> > > > > > > > > > > > > > > > > [2] https://dev.mysql.com/doc/refman/8.4/en/information-schema-tables-table.html
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > On Mon, Mar 3, 2025 at 02:00 Gabor Somogyi <[email protected]> wrote:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Hi Zakelly,
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > To aim for simplicity, `METADATA VIRTUAL` as the keywords for definition
>> > > > > > > > > > > > > > > > > > is the target.
>> > > > > > > > > > > > > > > > > > If it's not too complex, the latter can be added too.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > BR,
>> > > > > > > > > > > > > > > > > > G
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > On Sun, Mar 2, 2025 at 3:37 PM Zakelly Lan <[email protected]> wrote:
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Hi Gabor,
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > +1 for this.
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Will the metadata column use `METADATA VIRTUAL` as keywords for
>> > > > > > > > > > > > > > > > > > > definition, or `METADATA FROM xxx VIRTUAL` for renaming, just like the
>> > > > > > > > > > > > > > > > > > > Kafka table?
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Best,
>> > > > > > > > > > > > > > > > > > > Zakelly
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > On Sat, Mar 1, 2025 at 1:31 PM Gabor Somogyi <[email protected]> wrote:
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Hi All,
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > I'd like to start a discussion of FLIP-512: Add meta information to SQL
>> > > > > > > > > > > > > > > > > > > > state connector [1].
>> > > > > > > > > > > > > > > > > > > > Feel free to add your thoughts to make this feature better.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-512%3A+Add+meta+information+to+SQL+state+connector
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > BR,
>> > > > > > > > > > > > > > > > > > > > G
>> > > > > > > > > > > > > > > > > > > >
