Hi everyone, I'm fine with a separate SQL connector for metadata, so maybe we could update the FLIP to reflect our discussion? Also, Shengkai has provided a PTF implementation; does that meet the requirement as well?
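To make the use case concrete, here is a minimal sketch of one of the checks mentioned later in this thread (state uids whose size dropped sharply between two savepoints). It is plain Java over two in-memory maps and deliberately does not touch any Flink API; the class name, the method name, and the 0.5 threshold are all hypothetical and only for illustration. With a metadata table, the same comparison would presumably be a join between two tables.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Flink: compares per-uid state sizes
// taken from the metadata of two savepoints.
public class StateSizeDropCheck {

    /**
     * Returns the state uids present in both snapshots whose size in
     * {@code current} is smaller than {@code previous * keepRatio}.
     */
    public static List<String> findShrunkStates(
            Map<String, Long> previous, Map<String, Long> current, double keepRatio) {
        List<String> shrunk = new ArrayList<>();
        // Iterate over a sorted copy so the result order is deterministic.
        for (Map.Entry<String, Long> entry : new TreeMap<>(previous).entrySet()) {
            Long sizeNow = current.get(entry.getKey());
            // Uids missing from the newer savepoint are skipped here; they
            // belong to the separate "unknown or renamed uid" check.
            if (sizeNow != null && sizeNow < entry.getValue() * keepRatio) {
                shrunk.add(entry.getKey());
            }
        }
        return shrunk;
    }

    public static void main(String[] args) {
        // Made-up sizes in bytes, keyed by a made-up operator/state uid.
        Map<String, Long> older = Map.of("join-op", 1_000_000L, "agg-op", 500_000L);
        Map<String, Long> newer = Map.of("join-op", 980_000L, "agg-op", 10_000L);
        System.out.println(findShrunkStates(older, newer, 0.5));
    }
}
```

The point of the sketch is only that this logic is trivial once the metadata is available as data; the open question in the thread is how that data gets exposed.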
Best, Zakelly On Thu, Mar 20, 2025 at 4:47 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote: > Hi All, > > @Zakelly: Gyula summarised it correctly what I meant so please treat the > content as mine. > As an addition I'm not against to add CLI at all, I'm just stating that in > some cases like this, users would like to have > a self-serving solution where they can provide SQL statements which can > trigger alerts automatically. > > My personal opinion is that CLI would be beneficial for several cases. A > good example is when users want to restart job > from specific Kafka offsets which are persisted in a savepoint. For such > scenario users are more than happy since they > expect manual intervention with full control. So all in all one can count > on my +1 when CLI FLIP would come up... > > BR, > G > > > On Thu, Mar 20, 2025 at 8:20 AM Gyula Fóra <gyula.f...@gmail.com> wrote: > >> Hi! >> >> @Zakelly Lan <zakelly....@gmail.com> >> I think what Gabor means is that users want to have predefined SQL scripts >> to perform state analysis tasks to debug/identify problems. >> Such as write a SQL script that joins the metadata table with the state >> and >> do some analytics on it. >> >> If we have a meta table then the SQL script that can do this is fixed and >> users can trigger this on demand by simply providing a new savepoint path. >> >> If we have a different mechanism to extract metadata that is not SQL >> native >> then manual steps need to be executed and a custom SQL script would need >> to >> be written that adds the manually extracted metadata into the script. >> >> Cheers, >> Gyula >> >> On Thu, Mar 20, 2025 at 4:32 AM Zakelly Lan <zakelly....@gmail.com> >> wrote: >> >> > Hi all, >> > >> > Thanks for your answers! Getting everyone aligned on this topic is >> > challenging, but it’s definitely worth the effort since it will help >> > streamline things moving forward. 
>> > >> > @Gabor are you saying that users are using some scripts to define the >> SQL >> > metadata connector and get the information, right? If so, would a CLI >> tool >> > be more convenient? It's easy to invoke and can get the result swiftly. >> And >> > there should be some other systems to track the checkpoint lineage and >> > analyze if there are outliers in metadata (e.g. state size of one >> operator) >> > right? Well, maybe I missed something so please correct me if I'm wrong. >> > >> > I think the overall vision in Flink SQL is to provide a SQL native >> > > environment where we can serve complex use-cases like you would expect >> > in a >> > > regular database. >> > >> > >> > @Gyula Well, this is a good point. From the perspective of comprehensive >> > SQL experience, I'd +1 for treating metadata as data. Although I doubt >> if >> > there is a need for processing metadata, I won't be against a separate >> > connector. >> > >> > Regarding the CLI tool, I still think it’s worth implementing. Such a >> tool >> > could provide savepoint information before resuming from a savepoint, >> which >> > would enhance the user experience in CLI-based workflows. It would be >> good >> > if someone could implement this feature. We shouldn’t worry about >> whether >> > this tool might be retired in the future. Regardless of the SQL-based >> > solution we eventually adopt, this capability will remain essential for >> CLI >> > users. This is another topic. >> > >> > >> > Best, >> > Zakelly >> > >> > >> > On Thu, Mar 20, 2025 at 10:37 AM Shengkai Fang <fskm...@gmail.com> >> wrote: >> > >> > > Hi. >> > > >> > > After reading the doc[1], I think Spark provides a function for users >> to >> > > consume the metadata from the savepoint. In Flink SQL, similar >> > > functionality is implemented through Polymorphic Table Functions >> (PTF) as >> > > proposed in FLIP-440[2]. 
Below is a code example[3] illustrating this >> > > concept: >> > > >> > > ``` >> > > public static class ScalarArgsFunction extends >> > > TestProcessTableFunctionBase { >> > > public void eval(Integer i, Boolean b) { >> > > collectObjects(i, b); >> > > } >> > > } >> > > ``` >> > > >> > > ``` >> > > INSERT INTO sink SELECT * FROM f(i => 42, b => CAST('TRUE' AS >> BOOLEAN)) >> > > ``` >> > > >> > > So we can add a builtin function named `read_state_metadata` to read >> > > savepoint data. >> > > >> > > Best, >> > > Shengkai >> > > >> > > [1] >> > > >> > > >> > >> https://docs.databricks.com/aws/en/structured-streaming/read-state?language=SQL >> > > [2] >> > > >> > >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093 >> > > [3] >> > > >> > > >> > >> https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/stream/ProcessTableFunctionTestPrograms.java#L140 >> > > >> > > Gyula Fóra <gyula.f...@gmail.com> wrote on Wed, Mar 19, 2025 at 18:37: >> > > >> > > > Hi All! >> > > > >> > > > Thank you for the answers and concerns from everyone. >> > > > >> > > > On the CLI vs State Metadata Connector/Table question I would also >> like >> > > to >> > > > step back a little and look at the bigger picture. >> > > > >> > > > I think the overall vision in Flink SQL is to provide a SQL native >> > > > environment where we can serve complex use-cases like you would >> expect >> > > in a >> > > > regular database. >> > > > Most features and developments in recent years have gone this way. >> > > > >> > > > The State Metadata Table would be a natural and straightforward fit >> > here. >> > > > So from my side, +1 for that. >> > > > >> > > > However I could understand if we are not ready to add a new >> > > > connector/format due to maintenance concerns (and in general concern >> > > about >> > > > the design).
>> > > > If that's the issue then we should spend more time on the design to >> get >> > > > comfortable with the approach and seek feedback from the wider >> > community. >> > > > >> > > > I am -1 for the CLI/tooling approach as that will not provide the >> > > > feature set we are looking for that is not already covered by the >> Java >> > > > connector. And that approach would come with the same maintenance >> > > > implications. >> > > > >> > > > Cheers >> > > > Gyula >> > > > >> > > > >> > > > On Wed, Mar 19, 2025 at 11:24 AM Gabor Somogyi < >> > > gabor.g.somo...@gmail.com> >> > > > wrote: >> > > > >> > > > > Hi Zakelly, Shengkai >> > > > > >> > > > > Several topics are going on, so I'm adding short answers to them. If >> some >> > > > topic >> > > > > is not touched please highlight it. >> > > > > >> > > > > @Shengkai: I've read through all the previous FLIPs related to >> catalogs >> > > and >> > > > if >> > > > > we would like to keep the concepts there >> > > > > then a one-to-one mapping relationship between savepoint and catalog >> > is a >> > > > > reasonable direction. In short I'm happy that >> > > > > you've highlighted this and agree as a whole. I've written it down >> > > > > previously, just want to double confirm that state catalog is >> > > > > essential and planned. When we reach this point then your input is >> > more >> > > > > than welcome. >> > > > > >> > > > > @Zakelly: We've tried the CLI and separate library approaches with >> > > users >> > > > > already and these are not welcome because of >> the >> > > > > following: >> > > > > * Users want to have automated tasks and not manual CLI/library >> > output >> > > > > parsing. This can be hacked around but our experience is negative >> on >> > > this >> > > > > because it's just brittle.
>> > > > > * From a development perspective it's a much bigger effort than a >> > > > connector >> > > > > (hard to test, packaging/version handling is an extra layer of >> > > > complexity, >> > > > > external FS authentication is a pain for users, expecting them to >> > > download >> > > > > savepoints also) >> > > > > * Purely personal opinion, but if we find better ways later >> then >> > > > > retiring a CLI is not more lightweight than retiring a connector >> > > > > >> > > > > > It would be great if you give some examples on how user could >> > > leverage >> > > > > the separate connector to process the metadata. >> > > > > >> > > > > The simplest cases: >> > > > > * give me the overgrowing state uids >> > > > > * give me the not known (new or renamed) state uids >> > > > > * give me the state uids where state size drastically dropped >> compared >> > > to >> > > > a >> > > > > previous savepoint (accidental state loss) >> > > > > >> > > > > Since it was mentioned: as a general offtopic teaser, yeah it >> would >> > be >> > > > good >> > > > > to have some sort of checkpoint/savepoint lineage or however we >> call >> > > it. >> > > > > Since we've not yet reached this point there are no technical >> > details, >> > > > it's >> > > > > more like a vision. It's a common pattern that >> > > > > jobs are physically running but somehow the state processing is >> stuck >> > > and >> > > > > it would be good to add some way to find it out automatically. >> > > > > The key point here is automation and not manual evaluation >> > since >> > > > > handling 10k+ jobs just does not allow that. >> > > > > >> > > > > BR, >> > > > > G >> > > > > >> > > > > >> > > > > On Wed, Mar 19, 2025 at 6:46 AM Shengkai Fang <fskm...@gmail.com> >> > > wrote: >> > > > > >> > > > > > Hi, All. >> > > > > > >> > > > > > About State Catalog, I want to share more thoughts about this.
>> > > > > > >> > > > > > In the initial design concept, I understood that a savepoint >> and a >> > > > state >> > > > > > catalog have a one-to-one mapping relationship. Each operator >> > > > corresponds >> > > > > > to a database, and the state of each operator is represented as >> > > > > individual >> > > > > > tables. The rationale behind this design is: >> > > > > > >> > > > > > *State Diversity*: An operator may involve multiple types of >> > states. >> > > > For >> > > > > > example, in our VVR design, a "multi-join" operator uses keyed >> > states >> > > > for >> > > > > > two input streams and a broadcast state for the third stream. >> This >> > > > makes >> > > > > it >> > > > > > challenging to represent all states of an operator within a >> single >> > > > table. >> > > > > > *Scalability*: Internally, an operator might have multiple keyed >> > > states >> > > > > > (e.g., value state and list state). However, large list states >> may >> > > not >> > > > > fit >> > > > > > entirely in memory. To address this, we recommend implementing >> each >> > > > state >> > > > > > as a separate table. >> > > > > > >> > > > > > To resolve the loosely coupled relationships between operator >> > states, >> > > > we >> > > > > > propose embedding predefined views within the catalog. These >> views >> > > > > simplify >> > > > > > user understanding of operator implementations and provide a >> more >> > > > > intuitive >> > > > > > perspective. For instance, a join operator may have multiple >> state >> > > > > > implementations (depending on whether the join key includes >> unique >> > > > > > attributes), but users primarily care about the data associated >> > with >> > > a >> > > > > > specific join key across input streams. >> > > > > > >> > > > > > Returning to the one-to-one mapping between savepoints and >> > catalogs, >> > > we >> > > > > aim >> > > > > > to manage multiple user state catalogs through a catalog store. 
>> > When >> > > a >> > > > > user >> > > > > > triggers a savepoint for a job on the platform: >> > > > > > >> > > > > > 1. The platform sends a REST request to the JobManager. >> > > > > > 2. Simultaneously, it registers a new state catalog in the >> catalog >> > > > store, >> > > > > > enabling immediate analysis of state data on the platform. >> > > > > > 3. Deleting a savepoint would also trigger the removal of its >> > > > associated >> > > > > > catalog. >> > > > > > >> > > > > > This vision assumes that states are self-describing or that a >> state >> > > > > > metaservice is introduced to analyze savepoint structures. >> > > > > > >> > > > > > > How can users create logic to identify differences between >> > multiple >> > > > > > savepoints? >> > > > > > >> > > > > > Since savepoints and state catalogs are one-to-one mapped, users >> > can >> > > > > query >> > > > > > metadata via their respective catalogs. For example: >> > > > > > >> > > > > > 1. `savepoint-${id}`.`system`.`metadata_table`.`<operator-name>` >> > > > provides >> > > > > > operator-specific metadata (e.g., state size, type). >> > > > > > 2. Comparing metadata tables (e.g., schema versions, state entry >> > > > counts) >> > > > > > across catalogs reveals structural or quantitative differences. >> > > > > > 3. For deeper analysis, users could write SQL queries to compare >> > > > specific >> > > > > > state partitions or leverage the metaservice to track state >> > evolution >> > > > > > (e.g., added/removed operators, modified state configurations). >> > > > > > >> > > > > > If we plan to introduce a state catalog in the future, I would >> lean >> > > > > toward >> > > > > > using metadata tables. If a utility tool can address the >> challenges >> > > we >> > > > > > face, could we avoid introducing an additional connector? 
>> > > > > > >> > > > > > Best, >> > > > > > Shengkai >> > > > > > >> > > > > > Gyula Fóra <gyula.f...@gmail.com> 于2025年3月17日周一 20:25写道: >> > > > > > >> > > > > > > Hi All! >> > > > > > > >> > > > > > > Without going into too much detail here are my 2 cents >> regarding >> > > the >> > > > > > > virtual column / catalog metadata / table (connector) >> discussion >> > > for >> > > > > the >> > > > > > > State metadata. >> > > > > > > >> > > > > > > State metadata such as the types of states, their properties, >> > > names, >> > > > > > sizes >> > > > > > > etc are all valuable information that can be used to enrich >> the >> > > > > > > computations we do on state. >> > > > > > > We can either analyze it standalone (such as discover >> anomalies, >> > > for >> > > > > > large >> > > > > > > jobs with many states), across multiple savepoints (discover >> how >> > > > state >> > > > > > > changed over time) or by joining it with keyed or non-keyed >> state >> > > > data >> > > > > to >> > > > > > > serve more complex queries on the state. >> > > > > > > >> > > > > > > The only solution that seems to serve all these use-cases and >> > > > > > requirements >> > > > > > > in a straightforward and SQL canonical way is to simply expose >> > the >> > > > > state >> > > > > > > metadata as a separate table. This is a metadata table but you >> > can >> > > > also >> > > > > > > think of it as data table, it makes no practical difference >> here. >> > > > > > > >> > > > > > > Once we have a catalog later, the catalog can offer this table >> > out >> > > of >> > > > > the >> > > > > > > box, the same way databases provide metadata tables. For this >> to >> > > work >> > > > > > > however we need another, simpler connector that creates this >> > table. 
>> > > > > > > >> > > > > > > +1 for state metadata as a separate connector/table, instead >> of >> > > > adding >> > > > > > > virtual columns and adhoc catalog metadata that is hard to use >> > in a >> > > > > large >> > > > > > > number of queries. >> > > > > > > >> > > > > > > Cheers, >> > > > > > > Gyula >> > > > > > > >> > > > > > > On Mon, Mar 17, 2025 at 12:44 PM Gabor Somogyi < >> > > > > > gabor.g.somo...@gmail.com> >> > > > > > > wrote: >> > > > > > > >> > > > > > > > 1. State TTL for Value Columns >> > > > > > > > >> > > > > > > > > I’m planning on adding this, and we may collaborate on it >> in >> > > the >> > > > > > > future. >> > > > > > > > >> > > > > > > > +1 on this, just ping me. >> > > > > > > > >> > > > > > > > 2. Metadata Table vs. Metadata Column >> > > > > > > > >> > > > > > > > After some code digging and POC all I can say that with >> heavy >> > > > effort >> > > > > we >> > > > > > > can >> > > > > > > > maybe add such changes that we're able to show metadata of a >> > > > > savepoint >> > > > > > > from >> > > > > > > > catalog. >> > > > > > > > I'm not against that but from user perspective this has >> limited >> > > > > value, >> > > > > > > let >> > > > > > > > me explain why. 
>> > > > > > > > >> > > > > > > > From a high-level perspective I see the following, which I see >> > > > agreement >> > > > > > on: >> > > > > > > > * We should have a catalog which represents one or more >> > jobs' >> > > > > > > savepoint >> > > > > > > > data set (future plan) >> > > > > > > > * Savepoints should be able to be registered in the catalog >> > which >> > > > are >> > > > > > > then >> > > > > > > > databases (future plan) >> > > > > > > > * There must be a possibility to create tables from databases >> > > where >> > > > > > users >> > > > > > > > can read state data (exists already) >> > > > > > > > >> > > > > > > > In terms of metadata, if I understand correctly then the >> > > suggested >> > > > > > > approach >> > > > > > > > would be to access >> > > > > > > > it from the catalog describe command, right? Adding that >> info >> > > when >> > > > > > > specific >> > > > > > > > database describe command >> > > > > > > > is executed could be done. >> > > > > > > > >> > > > > > > > The question is for instance how can users create >> logic >> > > that >> > > > > > tells >> > > > > > > > them what is >> > > > > > > > the difference between multiple savepoints? >> > > > > > > > Just to give some examples: >> > > > > > > > * per operator size changes between savepoints >> > > > > > > > * show values from operator data where state size reaches a >> > > > boundary >> > > > > > > > * in general "find which checkpoint ruined things" is quite a >> > > common >> > > > > > > pattern >> > > > > > > > What I would like to highlight here is that from Flink's >> point of >> > > > view >> > > > > > the >> > > > > > > > metadata can be >> > > > > > > > considered static side-output information, but for users >> > > these >> > > > > > values >> > > > > > > > are actual real data >> > > > > > > > around which logic is planned to be built.
>> > > > > > > > >> > > > > > > > > The metadata is more like one-time information instead of >> a >> > > > > streaming >> > > > > > > > data that changes all >> > > > > > > > the time, so a single connector seems to be an overkill. >> > > > > > > > >> > > > > > > > State data is also static within a savepoint and that's the >> > > reason >> > > > > why >> > > > > > > the >> > > > > > > > state processor API is working in batch mode. >> > > > > > > > When we handle multiple checkpoints in a streaming fashion >> then >> > > > this >> > > > > > can >> > > > > > > be >> > > > > > > > viewed from another angle. >> > > > > > > > >> > > > > > > > We can come up with a more lightweight solution other than a >> new >> > > > > > connector >> > > > > > > > but forcing users to parse the catalog >> > > > > > > > describe command output in order to compare multiple >> savepoints >> > > > > doesn't >> > > > > > > > sound like a smooth user experience. >> > > > > > > > Honestly I have no other idea how to expose metadata as real >> user >> > > data >> > > > > so >> > > > > > > > I'm waiting for other approaches. >> > > > > > > > >> > > > > > > > BR, >> > > > > > > > G >> > > > > > > > >> > > > > > > > >> > > > > > > > On Thu, Mar 13, 2025 at 2:44 AM Shengkai Fang < >> > fskm...@gmail.com >> > > > >> > > > > > wrote: >> > > > > > > > >> > > > > > > > > Looking forward to hearing the good news! >> > > > > > > > > >> > > > > > > > > Best, >> > > > > > > > > Shengkai >> > > > > > > > > >> > > > > > > > > Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on Wed, Mar 12, 2025 at >> > > 22:24: >> > > > > > > > > >> > > > > > > > > > Thanks to both of you for the valuable input! >> > > > > > > > > > >> > > > > > > > > > Let me take a closer look at the suggestions, like the >> > > Catalog >> > > > > > > > > capabilities >> > > > > > > > > > and possibility of embedding TypeInformation or >> > > > > > > > > > StateDescriptor metadata directly into the raw state >> > files...
>> > > > > > > > > > >> > > > > > > > > > BR, >> > > > > > > > > > G >> > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > On Wed, Mar 12, 2025 at 8:17 AM Shengkai Fang < >> > > > fskm...@gmail.com >> > > > > > >> > > > > > > > wrote: >> > > > > > > > > > >> > > > > > > > > > > Thanks for Zakelly's clarification. >> > > > > > > > > > > >> > > > > > > > > > > 1. State TTL for Value Columns >> > > > > > > > > > > >> > > > > > > > > > > +1 to delay the discussion about this. >> > > > > > > > > > > >> > > > > > > > > > > 2. Metadata Table vs. Metadata Column >> > > > > > > > > > > >> > > > > > > > > > > I’d like to share my perspective on the State Catalog >> > > > proposal. >> > > > > > > While >> > > > > > > > > > > introducing this capability is beneficial, there is a >> > > > blocker: >> > > > > > the >> > > > > > > > > > current >> > > > > > > > > > > StateBackend architecture does not permit operators to >> > > encode >> > > > > > > > > > > TypeInformation into the state—it only preserves the >> > > > > Serializer. >> > > > > > > This >> > > > > > > > > > > limitation creates an asymmetry, as operators alone >> > retain >> > > > > > > knowledge >> > > > > > > > of >> > > > > > > > > > the >> > > > > > > > > > > data structure’s schema. >> > > > > > > > > > > >> > > > > > > > > > > To address this, I suggest allowing operators to embed >> > > > > > > > TypeInformation >> > > > > > > > > or >> > > > > > > > > > > StateDescriptor metadata directly into the raw state >> > files. >> > > > > Such >> > > > > > a >> > > > > > > > > design >> > > > > > > > > > > would enable the Catalog to: >> > > > > > > > > > > >> > > > > > > > > > > 1. Parse state files and programmatically derive the >> > schema >> > > > and >> > > > > > > > > > structural >> > > > > > > > > > > guarantees for each state. >> > > > > > > > > > > 2. 
Leverage existing Flink Table utilities, such as >> > > > > > > > > > > LegacyTypeInfoDataTypeConverter (in >> > > > > > > > > org.apache.flink.table.types.utils), >> > > > > > > > > > to >> > > > > > > > > > > bridge TypeInformation and DataType conversions. >> > > > > > > > > > > >> > > > > > > > > > > If we can not store the TypeInformation or >> > StateDescriptor >> > > > into >> > > > > > the >> > > > > > > > raw >> > > > > > > > > > > state files, I am +1 for this FLIP to use metadata >> column >> > > to >> > > > > > > retrieve >> > > > > > > > > > > information. >> > > > > > > > > > > >> > > > > > > > > > > Best, >> > > > > > > > > > > Shengkai >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > Zakelly Lan <zakelly....@gmail.com> 于2025年3月12日周三 >> > 12:43写道: >> > > > > > > > > > > >> > > > > > > > > > > > Hi Gabor and Shengkai, >> > > > > > > > > > > > >> > > > > > > > > > > > Thanks for sharing your thoughts! This is a long >> > > discussion >> > > > > and >> > > > > > > > sorry >> > > > > > > > > > for >> > > > > > > > > > > > the late reply (I'm busy catching up with release >> 2.0 >> > > these >> > > > > > > days). >> > > > > > > > > > > > >> > > > > > > > > > > > 1. State TTL for Value Columns >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > Let me first clarify your thoughts to ensure I >> > understand >> > > > > > > > correctly. >> > > > > > > > > > > IIUC, >> > > > > > > > > > > > there is no persistent configuration for state TTL >> in >> > the >> > > > > > > > checkpoint. >> > > > > > > > > > > While >> > > > > > > > > > > > you can infer that TTL is enabled by reading the >> > > > serializer, >> > > > > > the >> > > > > > > > > > > checkpoint >> > > > > > > > > > > > itself only stores the last access time for each >> value. 
>> > > So >> > > > > the >> > > > > > > only >> > > > > > > > > > thing >> > > > > > > > > > > > we can show is the last access time for each value. >> But >> > > it >> > > > is >> > > > > > not >> > > > > > > > > > > required >> > > > > > > > > > > > for all state backends to store this, as they may >> > > directly >> > > > > > store >> > > > > > > > the >> > > > > > > > > > > > expired time. This will also increase the >> difficulty of >> > > > > > > > > implementation >> > > > > > > > > > & >> > > > > > > > > > > > maintenance. >> > > > > > > > > > > > >> > > > > > > > > > > > This once again reiterates the importance of unified >> > > > metadata >> > > > > > for >> > > > > > > > > > > > checkpoints. I’m planning on adding this, and we may >> > > > > > collaborate >> > > > > > > on >> > > > > > > > > it >> > > > > > > > > > in >> > > > > > > > > > > > the future. >> > > > > > > > > > > > >> > > > > > > > > > > > 2. Metadata Table vs. Metadata Column >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > I'm not in favor of adding a new connector for >> > metadata. >> > > > The >> > > > > > > > metadata >> > > > > > > > > > is >> > > > > > > > > > > > more like one-time information instead of a >> streaming >> > > data >> > > > > that >> > > > > > > > > changes >> > > > > > > > > > > all >> > > > > > > > > > > > the time, so a single connector seems to be an >> > overkill. >> > > It >> > > > > is >> > > > > > > not >> > > > > > > > > easy >> > > > > > > > > > > to >> > > > > > > > > > > > withdraw a connector if we have a better solution in >> > > > future. >> > > > > > I'm >> > > > > > > > not >> > > > > > > > > > > > familiar with current Catalog capabilities, and if >> it >> > > could >> > > > > > > extract >> > > > > > > > > and >> > > > > > > > > > > > show some operator-level information from savepoint, >> > that >> > > > > would >> > > > > > > be >> > > > > > > > > > great. 
>> > > > > > > > > > > > >> > > > > > > > > > > > If the Catalog can't do that, I would consider the >> > > current >> > > > > FLIP >> > > > > > > to >> > > > > > > > > be a >> > > > > > > > > > > > compromise solution. >> > > > > > > > > > > > >> > > > > > > > > > > > And if we have that unified metadata for >> > > > checkpoint/savepoint >> > > > > > in >> > > > > > > > > > future, >> > > > > > > > > > > we >> > > > > > > > > > > > may directly register savepoint in catalog, and >> create >> > a >> > > > > source >> > > > > > > > > without >> > > > > > > > > > > > specifying complex columns, as well as describe the >> > > > savepoint >> > > > > > > > catalog >> > > > > > > > > > to >> > > > > > > > > > > > get the metadata. That's a good solution in my mind. >> > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > > Best, >> > > > > > > > > > > > Zakelly >> > > > > > > > > > > > >> > > > > > > > > > > > On Wed, Mar 12, 2025 at 10:35 AM Shengkai Fang < >> > > > > > > fskm...@gmail.com> >> > > > > > > > > > > wrote: >> > > > > > > > > > > > >> > > > > > > > > > > > > Hi Gabor, >> > > > > > > > > > > > > >> > > > > > > > > > > > > > 2. Adding a new connector with >> `savepoint-metadata` >> > > > > > > > > > > > > >> > > > > > > > > > > > > I would argue against introducing a new connector >> > type >> > > > > named >> > > > > > > > > > > > > savepoint-metadata, as the existing Catalog >> mechanism >> > > can >> > > > > > > > > inherently >> > > > > > > > > > > > > provide the necessary connector factory >> capabilities. >> > > > I’ve >> > > > > > > > detailed >> > > > > > > > > > > this >> > > > > > > > > > > > > proposal in branch[1]. Please take a moment to >> review >> > > it. 
>> > > > > > > > > > > > > >> > > > > > > > > > > > > If we introduce a connector named >> > `savepoint-metadata`, >> > > > it >> > > > > > > means >> > > > > > > > > user >> > > > > > > > > > > can >> > > > > > > > > > > > > create a temporary table with connector >> > > > > `savepoint-metadata` >> > > > > > > and >> > > > > > > > > the >> > > > > > > > > > > > > connector needs to check whether table schema is >> same >> > > to >> > > > > the >> > > > > > > > schema >> > > > > > > > > > we >> > > > > > > > > > > > > proposed in the FLIP. On the other hand, it's not >> > easy >> > > > work >> > > > > > for >> > > > > > > > > > others >> > > > > > > > > > > to >> > > > > > > > > > > > > users a metadata table with same schema. >> > > > > > > > > > > > > >> > > > > > > > > > > > > [1] >> > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > >> > >> https://github.com/apache/flink/compare/master...fsk119:flink:state-metadata?expand=1#diff-712a7bc92fe46c405fb0e61b475bb2a005cb7a72bab7df28bbb92744bcb5f465R63 >> > > > > > > > > > > > > >> > > > > > > > > > > > > Best, >> > > > > > > > > > > > > Shengkai >> > > > > > > > > > > > > >> > > > > > > > > > > > > Gabor Somogyi <gabor.g.somo...@gmail.com> >> > > 于2025年3月11日周二 >> > > > > > > 16:56写道: >> > > > > > > > > > > > > >> > > > > > > > > > > > > > Hi Shengkai, >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > > 1. State TTL for Value Columns >> > > > > > > > > > > > > > >> > > > > > > > > > > > > > From directional perspective I agree your idea >> how >> > it >> > > > can >> > > > > > be >> > > > > > > > > > > > implemented. 
Previously I've mentioned that TTL information is not exposed on the state
processor API (which the SQL state connector uses to read data), and unless
somebody shows me otherwise this FLIP is not going to address it, to avoid
feature creep. Our users are also interested in TTL, so sooner or later we're
going to expose it; this is a matter of scheduling.

> 2. Adding a new connector with `savepoint-metadata`

I'm not sure I understand your point about StateCatalog. First of all, I
can't agree more that StateCatalog is needed, and it is a planned building
block in an upcoming FLIP, but I'm not sure how it can help now. No matter
what, your knowledge is essential when we add StateCatalog. Let me lay out my
understanding in this area:
* First we need create table statements to access state data and metadata
* Once we have that, we can add StateCatalog, which could potentially ease
  the life of users by, for example, giving off-the-shelf tables without
  sweating with create table statements

User expectations:
* See state data (this is fulfilled with the existing connector)
* See metadata about state data like TTL (this can be added as a metadata
  column as you suggested, since it belongs to the data)
* See metadata about operators (this can be added from savepoint-metadata)

It's important to highlight that the state data table format differs from the
state metadata table format. Namely, one table has rows for state values and
the other has rows for operators, right?
I think that's the reason why you've pointed out that the suggested metadata
columns are somewhat clunky.

In conclusion, I agree to add a ${state-name}_ttl metadata column later on,
since it belongs to the state value, and to add a new table type (like you
suggested, similar to PG [1]) for metadata. Please see how Spark does that
too [2].

If you have a better approach then please elaborate with more details and
help me understand your point.

> Up until now we've seen even in TB savepoints that the number of keys can
> be extremely huge but not the per key state itself.
> But again, this is a good feature as-is and can be handled in a separate
> jira.

I've just created https://issues.apache.org/jira/browse/FLINK-37456.

[1] https://www.postgresql.org/docs/current/view-pg-tables.html
[2] https://www.databricks.com/blog/announcing-state-reader-api-new-statestore-data-source

BR,
G


On Tue, Mar 11, 2025 at 3:55 AM Shengkai Fang <fskm...@gmail.com> wrote:

> Hi, Gabor. Thanks for your response.
>
> > 1. State TTL for Value Columns
>
> Thank you for addressing the limitations here. However, I believe it would
> be beneficial to further clarify the API in this FLIP regarding how users
> can specify the TTL column.
>
> One potential approach that comes to mind is using a standardized naming
> convention such as ${state-name}_ttl for the metadata column that defines
> the TTL value. In terms of implementation, the listReadableMetadata
> function could:
>
> 1. Read the table’s columns and configuration,
> 2. Extract all defined state names, and
> 3. Return a structured list of metadata entries formatted as
>    ${state-name}_ttl.
>
> WDYT?
>
> > 2. Adding a new connector with `savepoint-metadata`
>
> Introducing a new connector type at this stage may unnecessarily complicate
> the system. Given that every table already belongs to a Catalog, which is
> designed to provide a Factory for building source or sink connectors, I
> propose integrating a dedicated StateCatalog instead. This approach would
> allow us to:
>
> 1. Leverage the Catalog’s existing capabilities to manage TTL metadata
>    (e.g., state names and TTL logic) without duplicating functionality.
> 2. Provide a unified interface for connector instantiation and metadata
>    handling through the Catalog’s Factory pattern.
>
> Would this design decision better align with our architecture’s
> extensibility and reduce redundancy?
>
> > Up until now we've seen even in TB savepoints that the number of keys can
> > be extremely huge but not the per key state itself.
> > But again, this is a good feature as-is and can be handled in a separate
> > jira.
>
> +1 for a separate jira.
>
> Best,
> Shengkai
>
> On Mon, Mar 10, 2025 at 7:05 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
>
> > Hi Shengkai,
> >
> > Please see my comments inline.
> >
> > BR,
> > G
> >
> >
> > On Mon, Mar 3, 2025 at 7:07 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >
> > > Hi, Gabor. Thanks for the FLIP. I have some questions about the FLIP:
> > >
> > > 1. State TTL for Value Columns
> > > How can users retrieve the state TTL (Time-to-Live) for each value
> > > column? From my understanding of the current design, it seems that this
> > > functionality is not supported. Could you clarify if there are plans to
> > > address this limitation?
> >
> > Since the state processor API does not yet expose this information, this
> > would require several steps.
> > First, support needs to be added to the state processor API, which can
> > then be exposed on the SQL API.
> > This is definitely a useful future improvement and can be handled in a
> > separate jira.
> >
> > > 2. Metadata Table vs. Metadata Column
> > > The metadata information described in the FLIP appears to be intended
> > > to describe the state files stored at a specific location. To me, this
> > > concept aligns more closely with system tables like pg_tables in
> > > PostgreSQL [1] or the INFORMATION_SCHEMA in MySQL [2].
> >
> > Adding a new connector with `savepoint-metadata` is a possibility where we
> > can create such functionality.
> > I'm not against that, I just want to have a common agreement that we would
> > like to move in that direction.
> > (As a side note, not just PG but Spark also has a similar approach, and I
> > basically like the idea.)
> > If we go in that direction, savepoint metadata can be exposed in a way
> > that one row represents an operator with its values, something like this:
> >
> > ┌───────────────────────────┬───────────────────────────────┬──────────────────────────────────┬─────────────┬────────────────┬────────────────────┬─────────────────────────────┬────────────────────────┐
> > │ operatorName              │ operatorUid                   │ operatorHash                     │ parallelism │ maxParallelism │ subtaskStatesCount │ coordinatorStateSizeInBytes │ totalStatesSizeInBytes │
> > ├───────────────────────────┼───────────────────────────────┼──────────────────────────────────┼─────────────┼────────────────┼────────────────────┼─────────────────────────────┼────────────────────────┤
> > │ Source: datagen-source    │ datagen-source-uid            │ 47aee94394d6ea26e2d544bef0a37bb5 │ 2           │ 128            │ 2                  │ 16                          │ 546                    │
> > ├───────────────────────────┼───────────────────────────────┼──────────────────────────────────┼─────────────┼────────────────┼────────────────────┼─────────────────────────────┼────────────────────────┤
> > │ long-udf-with-master-hook │ long-udf-with-master-hook-uid │ 6ed3f40bff3c8dfcdfcb95128a1018f1 │ 2           │ 128            │ 2                  │ 0                           │ 0                      │
> > ├───────────────────────────┼───────────────────────────────┼──────────────────────────────────┼─────────────┼────────────────┼────────────────────┼─────────────────────────────┼────────────────────────┤
> > │ value-process             │ value-process-uid             │ ca4f5fe9a637b656f09ea78b3e7a15b9 │ 2           │ 128            │ 2                  │ 0                           │ 40726                  │
> > ├───────────────────────────┼───────────────────────────────┼──────────────────────────────────┼─────────────┼────────────────┼────────────────────┼─────────────────────────────┼────────────────────────┤
> >
> > This table can then be joined with the tables created by the already
> > existing `savepoint` connector, based on the UID hash (which is unique and
> > always exists).
> > This would mean that the already existing table would need only a single
> > metadata column, which is the UID hash.
> > WDYT?
> > @zakelly, plz share your thoughts too.
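
To make the join on the UID hash a bit more tangible, here is a minimal Python sketch. It is only an illustration and not Flink code: the layout of the state rows, the `operatorHash` field on them, and the helper `join_on_uid_hash` are assumptions made for this example. The operator values are copied from the example table in this message.

```python
from typing import Dict, List

# Operator-level rows, as a hypothetical `savepoint-metadata` table might
# expose them (values copied from the example table in this thread).
operator_metadata: List[Dict] = [
    {"operatorName": "Source: datagen-source",
     "operatorUid": "datagen-source-uid",
     "operatorHash": "47aee94394d6ea26e2d544bef0a37bb5",
     "totalStatesSizeInBytes": 546},
    {"operatorName": "value-process",
     "operatorUid": "value-process-uid",
     "operatorHash": "ca4f5fe9a637b656f09ea78b3e7a15b9",
     "totalStatesSizeInBytes": 40726},
]

# State rows as a `savepoint` table might expose them. The key/value columns
# are made up; the single extra metadata column is the UID hash (assumption).
state_rows: List[Dict] = [
    {"key": 1, "value": 10, "operatorHash": "ca4f5fe9a637b656f09ea78b3e7a15b9"},
    {"key": 2, "value": 20, "operatorHash": "ca4f5fe9a637b656f09ea78b3e7a15b9"},
]


def join_on_uid_hash(states: List[Dict], operators: List[Dict]) -> List[Dict]:
    """Inner-join state rows with operator metadata on the UID hash."""
    by_hash = {op["operatorHash"]: op for op in operators}
    joined = []
    for row in states:
        op = by_hash.get(row["operatorHash"])
        if op is not None:  # drop state rows without matching operator metadata
            joined.append({**row, **op})
    return joined


for row in join_on_uid_hash(state_rows, operator_metadata):
    print(row["key"], row["value"], row["operatorName"],
          row["totalStatesSizeInBytes"])
```

Because the hash always exists and is unique per operator, a dictionary lookup per state row is enough here; in SQL the same idea would be an equi-join on the hash column.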
> >
> > > If we opt to use metadata columns, every record in the table would end
> > > up having identical values for these columns (please correct me if I’m
> > > mistaken). On the other hand, the state connector requires users to
> > > specify an operator UID or operator UID hash, after which it outputs
> > > user-defined values in its records. This approach feels somewhat
> > > redundant to me.
> >
> > If we add a new `savepoint-metadata` connector then this can be
> > addressed.
> > On the other hand, UID and UID hash have an either-or relationship from a
> > config perspective, so when a user provides the UID he/she may be
> > interested in the hash for further calculations (all of Flink's internals
> > depend on the hash). Printing out the human-readable UID is an explicit
> > requirement from the user side because hashes are not human-readable.
> >
> > > 3. Handling LIST and MAP States in the State Connector
> > > I have concerns about how the current design handles LIST and MAP
> > > states. Specifically, the state connector uses Flink SQL’s MAP and
> > > ARRAY types, which implies that it attempts to load entire MAP or LIST
> > > states into memory.
> > >
> > > However, in many real-world scenarios, these states can grow very
> > > large. Typically, the state API addresses this by providing an iterator
> > > to traverse elements within the state incrementally. I’m unsure whether
> > > I’ve missed something in FLIP-496 or FLIP-512, but it seems that the
> > > current design might struggle with scalability in such cases.
> >
> > You see it right: the current implementation keeps the state for a single
> > key in memory.
> > Back in the day we considered this potential issue and concluded that
> > iterating incrementally is not necessarily needed for the initial version
> > and can be done as a later improvement.
> >
> > Up until now we've seen even in TB savepoints that the number of keys can
> > be extremely huge but not the per key state itself.
> > But again, this is a good feature as-is and can be handled in a separate
> > jira.
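
The difference between loading a whole MAP or LIST state for one key and traversing it incrementally can be sketched in a few lines of Python. This is a generic illustration, not the Flink state API: `load_whole_state` and `iterate_in_batches` are hypothetical helpers, and the state entries are simulated with a plain iterable.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple

Entry = Tuple[str, int]


def load_whole_state(entries: Iterable[Entry]) -> Dict[str, int]:
    """Materialize every entry of one key's MAP state at once.

    This mirrors exposing the state as a single SQL MAP value: memory use
    grows with the size of the state for that key.
    """
    return dict(entries)


def iterate_in_batches(entries: Iterable[Entry],
                       batch_size: int) -> Iterator[List[Entry]]:
    """Yield the entries in bounded batches instead of all at once.

    At most `batch_size` entries are held in memory at any time.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    it = iter(entries)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


# Simulated MAP state of a single key with five entries.
simulated_state = [("k%d" % i, i) for i in range(5)]

print(len(load_whole_state(simulated_state)))
for batch in iterate_in_batches(simulated_state, batch_size=2):
    print(batch)
```

The eager variant is simpler and is fine as long as the per-key state stays small, which matches the observation in this thread; the batched variant is the shape a later improvement could take if per-key state turns out to be large.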
> >
> > > Best,
> > > Shengkai
> > >
> > > [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> > > [2] https://dev.mysql.com/doc/refman/8.4/en/information-schema-tables-table.html
> > >
> > > On Mon, Mar 3, 2025 at 2:00 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > >
> > > > Hi Zakelly,
> > > >
> > > > To keep things simple, `METADATA VIRTUAL` as the keywords for the
> > > > definition is the target.
> > > > If it's not super complex, the latter can be added too.
> > > >
> > > > BR,
> > > > G
> > > >
> > > >
> > > > On Sun, Mar 2, 2025 at 3:37 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> > > >
> > > > > Hi Gabor,
> > > > >
> > > > > +1 for this.
> > > > >
> > > > > Will the metadata column use `METADATA VIRTUAL` as the keywords for
> > > > > the definition, or `METADATA FROM xxx VIRTUAL` for renaming, just
> > > > > like the Kafka table?
> > > > >
> > > > >
> > > > > Best,
> > > > > Zakelly
> > > > >
> > > > > On Sat, Mar 1, 2025 at 1:31 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > > > >
> > > > > > Hi All,
> > > > > >
> > > > > > I'd like to start a discussion of FLIP-512: Add meta information
> > > > > > to SQL state connector [1].
> > > > > > Feel free to add your thoughts to make this feature better.
> > > > > >
> > > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-512%3A+Add+meta+information+to+SQL+state+connector