Just to avoid jumping from one thread to another, here's Timo's suggestion:

"if I understand the discussion correctly, you want to use a PTF without table arguments to return a table (read from savepoint metadata)? If this is the case, you don't need a PTF for it. A regular table function can also do the job. IIRC we support TVF with constant args."
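
As a rough sketch of what that could look like from the SQL side: the function name `read_state_metadata` is the one used further down in this thread, the savepoint path is a placeholder, and the exact invocation syntax for such a table function is an assumption rather than something confirmed by the FLIP.

```sql
-- Hypothetical: read savepoint metadata through a table function that takes
-- a constant savepoint path as its only argument (no table arguments).
SELECT * FROM read_state_metadata('/path/to/savepoint');

-- As suggested in the thread, EXPLAIN can stand in for a "describe function"
-- feature when you want to see the output column names/types.
EXPLAIN SELECT * FROM read_state_metadata('/path/to/savepoint');
```
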
I've tried the TVF out and it works in batch mode like a charm.

> I'd also suggest we make it built-in without registration.

I basically agree to have this as built-in.

From my perspective there are no further questions or concerns, and I've updated the FLIP accordingly. If somebody has any, please share; otherwise I would like to go on with the vote.

All in all, thanks to everybody for the constructive suggestions; we've made things much better.

BR,
G

On Fri, Mar 28, 2025 at 9:19 AM Zakelly Lan <zakelly....@gmail.com> wrote:

> Hi all,
>
> Given the simplicity, I also +1 for PTF or any other function implementation if PTF is not applicable for this.
>
> > I would like to raise a consideration regarding the usage implementation:
> > Would it be necessary to allow users to utilize the CREATE FUNCTION
> > statement for registering the PTF?
>
> I'd also suggest we make it built-in without registration.
>
> > Currently, Flink SQL supports letting external systems register modules and
> > leverage these modules to centrally manage all function definitions. Given
> > this architectural approach, I'm curious if the plan involves introducing
> > additional functions in the future. If so, I would advocate for introducing
> > a dedicated state module to centralize such management. This would empower
> > users to:
>
> I can't think of any further functions for now, but I'd +1 for a module if it could omit the registration.
>
> Best,
> Zakelly.
>
> On Fri, Mar 28, 2025 at 10:25 AM Shengkai Fang <fskm...@gmail.com> wrote:
>
> > One more question about the FLIP.
> >
> > I think the output schema is definitely a public API to users. If users use the `CREATE FUNCTION` statement, does it mean the class path is also a public API to users? Alternatively, this is merely an experimental feature and we don't make any promise about this function.
> >
> > Best,
> > Shengkai
> >
> > On Fri, Mar 28, 2025 at 10:20, Shengkai Fang <fskm...@gmail.com> wrote:
> >
> >> +1 to use PTF.
> >>
> >> I would like to raise a consideration regarding the usage implementation: Would it be necessary to allow users to utilize the CREATE FUNCTION statement for registering the PTF?
> >>
> >> Currently, Flink SQL supports letting external systems register modules and leverage these modules to centrally manage all function definitions. Given this architectural approach, I'm curious if the plan involves introducing additional functions in the future. If so, I would advocate for introducing a dedicated state module to centralize such management. This would empower users to:
> >>
> >> 1. Simply execute the LOAD MODULE command to load the required module, and
> >> 2. Directly invoke read_metadata thereafter.
> >>
> >> For more details about the module, please refer to this document[1].
> >>
> >> Best,
> >> Shengkai
> >>
> >> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/modules/
> >>
> >> On Fri, Mar 28, 2025 at 00:26, Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>
> >>> Just found out that PTF in batch mode is not supported, please see the dev mailing list thread about it [1].
> >>>
> >>> [1] https://lists.apache.org/thread/ytm9m1qt4pq2q2gjngfktrn8vrlvkf07
> >>>
> >>> BR,
> >>> G
> >>>
> >>> On Thu, Mar 27, 2025 at 3:38 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>>
> >>> > In the meantime I've just updated the FLIP according to this to be optimistic 🙂
> >>> >
> >>> > BR,
> >>> > G
> >>> >
> >>> > On Thu, Mar 27, 2025 at 2:15 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >
> >>> >> Considering all the facts I also +1 on PTF. Even if something is missing we can add it later.
> >>> >>
> >>> >> @Zakelly Lan <zakelly....@gmail.com> @Shengkai Fang are you also on the same page or have something to add?
> >>> >>
> >>> >> BR,
> >>> >> G
> >>> >>
> >>> >> On Thu, Mar 27, 2025 at 1:50 PM Lincoln Lee <lincoln.8...@gmail.com> wrote:
> >>> >>
> >>> >>> +1 for PTF
> >>> >>>
> >>> >>> > Is it possible to describe such function to see the column names/types?
> >>> >>>
> >>> >>> Although Flink SQL does not directly support this feature, users can achieve similar results with the help of `explain` syntax, e.g. 'explain select * from read_state_metadata(...)'
> >>> >>>
> >>> >>> Best,
> >>> >>> Lincoln Lee
> >>> >>>
> >>> >>> On Thu, Mar 27, 2025 at 20:41, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> >>>
> >>> >>> > Hey!
> >>> >>> >
> >>> >>> > I think the PTF approach strikes a great balance between simplicity and the capabilities that we get out of it.
> >>> >>> >
> >>> >>> > I think this could be a completely viable alternative to the dedicated connector, +1.
> >>> >>> >
> >>> >>> > Cheers,
> >>> >>> > Gyula
> >>> >>> >
> >>> >>> > On Thu, Mar 27, 2025 at 10:37 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >>> >
> >>> >>> > > Hi, Gabor.
> >>> >>> > >
> >>> >>> > > > Do I understand correctly that this is 2.x only feature and we can't backport it to 1.x line
> >>> >>> > >
> >>> >>> > > Yes. PTF is only supported in the 2.x versions.
> >>> >>> > >
> >>> >>> > > > Is it possible to describe such function to see the column names/types?
> >>> >>> > >
> >>> >>> > > Flink SQL doesn't support this feature, but postgres[2] and mysql[1] have a similar feature.
> >>> >>> > >
> >>> >>> > > [1] https://dev.mysql.com/doc/refman/8.4/en/show-create-procedure.html
> >>> >>> > > [2] https://stackoverflow.com/questions/6898453/show-the-code-of-a-function-procedure-and-trigger-in-postgresql
> >>> >>> > >
> >>> >>> > > Best,
> >>> >>> > > Shengkai
> >>> >>> > >
> >>> >>> > > On Thu, Mar 27, 2025 at 16:25, Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > >
> >>> >>> > > > Hi Shengkai,
> >>> >>> > > >
> >>> >>> > > > Thanks for your effort with the example, this looks promising. I like the fact that users wouldn't need to sweat with complex create table statements.
> >>> >>> > > >
> >>> >>> > > > Couple of questions:
> >>> >>> > > > * Do I understand correctly that this is a 2.x only feature and we can't backport it to the 1.x line? I don't intend to do any backport, I would just like to know the technical constraints.
> >>> >>> > > > * Is it possible to describe such a function to see the column names/types?
> >>> >>> > > >
> >>> >>> > > > BR,
> >>> >>> > > > G
> >>> >>> > > >
> >>> >>> > > > On Thu, Mar 27, 2025 at 3:17 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >>> > > >
> >>> >>> > > > > Many thanks for your reminder, Leonard. Here's the link I mentioned[1].
> >>> >>> > > > >
> >>> >>> > > > > Best,
> >>> >>> > > > > Shengkai
> >>> >>> > > > >
> >>> >>> > > > > [1] https://github.com/apache/flink/pull/26358
> >>> >>> > > > >
> >>> >>> > > > > On Thu, Mar 27, 2025 at 10:05, Leonard Xu <xbjt...@gmail.com> wrote:
> >>> >>> > > > >
> >>> >>> > > > > > Your link is broken, Shengkai
> >>> >>> > > > > >
> >>> >>> > > > > > Best,
> >>> >>> > > > > > Leonard
> >>> >>> > > > > >
> >>> >>> > > > > > > On Mar 27, 2025, at 10:01, Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >>> > > > > > >
> >>> >>> > > > > > > Hi, All.
> >>> >>> > > > > > >
> >>> >>> > > > > > > I wrote a simple demo to illustrate my idea. Hope this helps.
> >>> >>> > > > > > >
> >>> >>> > > > > > > Best,
> >>> >>> > > > > > > Shengkai
> >>> >>> > > > > > >
> >>> >>> > > > > > > https://github.com/apache/flink/compare/master...fsk119:flink:example?expand=1
> >>> >>> > > > > > >
> >>> >>> > > > > > > On Wed, Mar 26, 2025 at 15:54, Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > > > > > >
> >>> >>> > > > > > >>> I'm fine with a separate SQL connector for metadata, so maybe we could update the FLIP about our discussion?
> >>> >>> > > > > > >>
> >>> >>> > > > > > >> Sorry, I've forgotten this part. Yeah, no matter what we choose I'm going to update the FLIP.
> >>> >>> > > > > > >>
> >>> >>> > > > > > >> G
> >>> >>> > > > > > >>
> >>> >>> > > > > > >> On Wed, Mar 26, 2025 at 8:51 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > > > > > >>
> >>> >>> > > > > > >>> Hi All,
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> I also lack knowledge of PTF, so I've read just the motivation part:
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> "The SQL 2016 standard introduced a way of defining custom SQL operators defined by ISO/IEC 19075-7:2021 (Part 7: Polymorphic table functions). ~200 pages define how this new kind of function can consume and produce tables with various execution properties. Unfortunately, this part of the standard is not publicly available."
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> Of course we can take a look at some examples, but do we really want to expose state data with this construct, which is described in ~200 pages and part of a standard that is not publicly available? 🙂
> >>> >>> > > > > > >>> I mean the dataset is a couple of rows and the use-case is a join with another table, like with state data.
> >>> >>> > > > > > >>> If somebody can give advantages I would buy that, but from my limited understanding this would be overkill here.
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> BR,
> >>> >>> > > > > > >>> G
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>> On Wed, Mar 26, 2025 at 8:28 AM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> >>> > > > > > >>>
> >>> >>> > > > > > >>>> Hi Zakelly, Shengkai!
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>> I don't know too much about PTFs, it would be interesting to see how the usage would look in practice.
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>> Do you have some mockup/example in mind of how the PTF would look, for example when we want to:
> >>> >>> > > > > > >>>> - Simply display/aggregate what's in the metadata
> >>> >>> > > > > > >>>> - Join keyed state with some metadata columns
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>> Thanks
> >>> >>> > > > > > >>>> Gyula
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>> On Wed, Mar 26, 2025 at 7:33 AM Zakelly Lan <zakelly....@gmail.com> wrote:
> >>> >>> > > > > > >>>>
> >>> >>> > > > > > >>>>> Hi everyone,
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>> I'm fine with a separate SQL connector for metadata, so maybe we could update the FLIP about our discussion? And Shengkai provides a PTF implementation, does that also meet the requirement?
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>> Best,
> >>> >>> > > > > > >>>>> Zakelly
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>> On Thu, Mar 20, 2025 at 4:47 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>
> >>> >>> > > > > > >>>>>> Hi All,
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>> @Zakelly: Gyula summarised correctly what I meant, so please treat the content as mine.
> >>> >>> > > > > > >>>>>> In addition, I'm not against adding a CLI at all, I'm just stating that in some cases like this, users would like to have a self-serving solution where they can provide SQL statements which can trigger alerts automatically.
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>> My personal opinion is that a CLI would be beneficial for several cases. A good example is when users want to restart a job from specific Kafka offsets which are persisted in a savepoint. For such a scenario users are more than happy, since they expect manual intervention with full control. So all in all, one can count on my +1 when a CLI FLIP comes up...
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>> BR,
> >>> >>> > > > > > >>>>>> G
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>> On Thu, Mar 20, 2025 at 8:20 AM Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>
> >>> >>> > > > > > >>>>>>> Hi!
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> @Zakelly Lan <zakelly....@gmail.com>
> >>> >>> > > > > > >>>>>>> I think what Gabor means is that users want to have predefined SQL scripts to perform state analysis tasks to debug/identify problems. Such as write a SQL script that joins the metadata table with the state and do some analytics on it.
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> If we have a meta table then the SQL script that can do this is fixed and users can trigger this on demand by simply providing a new savepoint path.
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> If we have a different mechanism to extract metadata that is not SQL native then manual steps need to be executed and a custom SQL script would need to be written that adds the manually extracted metadata into the script.
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> Cheers,
> >>> >>> > > > > > >>>>>>> Gyula
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>> On Thu, Mar 20, 2025 at 4:32 AM Zakelly Lan <zakelly....@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>
> >>> >>> > > > > > >>>>>>>> Hi all,
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> Thanks for your answers! Getting everyone aligned on this topic is challenging, but it's definitely worth the effort since it will help streamline things moving forward.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> @Gabor are you saying that users are using some scripts to define the SQL metadata connector and get the information, right? If so, would a CLI tool be more convenient? It's easy to invoke and can get the result swiftly. And there should be some other systems to track the checkpoint lineage and analyze if there are outliers in metadata (e.g. state size of one operator) right? Well, maybe I missed something so please correct me if I'm wrong.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>>> I think the overall vision in Flink SQL is to provide a SQL native environment where we can serve complex use-cases like you would expect in a regular database.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> @Gyula Well, this is a good point. From the perspective of comprehensive SQL experience, I'd +1 for treating metadata as data. Although I doubt if there is a need for processing metadata, I won't be against a separate connector.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> Regarding the CLI tool, I still think it's worth implementing. Such a tool could provide savepoint information before resuming from a savepoint, which would enhance the user experience in CLI-based workflows. It would be good if someone could implement this feature. We shouldn't worry about whether this tool might be retired in the future.
> >>> >>> > > > > > >>>>>>>> Regardless of the SQL-based solution we eventually adopt, this capability will remain essential for CLI users. This is another topic.
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>> Zakelly
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>> On Thu, Mar 20, 2025 at 10:37 AM Shengkai Fang <fskm...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>
> >>> >>> > > > > > >>>>>>>>> Hi.
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> After reading the doc[1], I think Spark provides a function for users to consume the metadata from the savepoint. In Flink SQL, similar functionality is implemented through Polymorphic Table Functions (PTF) as proposed in FLIP-440[2].
> >>> >>> > > > > > >>>>>>>>> Below is a code example[3] illustrating this concept:
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> ```
> >>> >>> > > > > > >>>>>>>>> public static class ScalarArgsFunction extends TestProcessTableFunctionBase {
> >>> >>> > > > > > >>>>>>>>>     public void eval(Integer i, Boolean b) {
> >>> >>> > > > > > >>>>>>>>>         collectObjects(i, b);
> >>> >>> > > > > > >>>>>>>>>     }
> >>> >>> > > > > > >>>>>>>>> }
> >>> >>> > > > > > >>>>>>>>> ```
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> ```
> >>> >>> > > > > > >>>>>>>>> INSERT INTO sink SELECT * FROM f(i => 42, b => CAST('TRUE' AS BOOLEAN))
> >>> >>> > > > > > >>>>>>>>> ```
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> So we can add a builtin function named `read_state_metadata` to read savepoint data.
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> Best,
> >>> >>> > > > > > >>>>>>>>> Shengkai
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> [1] https://docs.databricks.com/aws/en/structured-streaming/read-state?language=SQL
> >>> >>> > > > > > >>>>>>>>> [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=298781093
> >>> >>> > > > > > >>>>>>>>> [3] https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/test/java/org/apache/flink/table/planner/plan/nodes/exec/stream/ProcessTableFunctionTestPrograms.java#L140
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>> On Wed, Mar 19, 2025 at 18:37, Gyula Fóra <gyula.f...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> Hi All!
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> Thank you for the answers and concerns from everyone.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> On the CLI vs State Metadata Connector/Table question I would also like to step back a little and look at the bigger picture.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> I think the overall vision in Flink SQL is to provide a SQL native environment where we can serve complex use-cases like you would expect in a regular database.
> >>> >>> > > > > > >>>>>>>>>> Most features, developments in the recent years have gone this way.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> The State Metadata Table would be a natural and straightforward fit here.
> >>> >>> > > > > > >>>>>>>>>> So from my side, +1 for that.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> However I could understand if we are not ready to add a new connector/format due to maintenance concerns (and in general concern about the design).
> >>> >>> > > > > > >>>>>>>>>> If that's the issue then we should spend more time on the design to get comfortable with the approach and seek feedback from the wider community.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> I am -1 for the CLI/tooling approach as that will not provide the feature set we are looking for that is not already covered by the Java connector. And that approach would come with the same maintenance implications.
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> Cheers
> >>> >>> > > > > > >>>>>>>>>> Gyula
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>> On Wed, Mar 19, 2025 at 11:24 AM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> >>> >>> > > > > > >>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> Hi Zakelly, Shengkai
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> Several topics are going on, so I'm adding the gist of my answers to them. If some topic is not touched, please highlight it.
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> @Shengkai: I've read through all the previous FLIPs related to catalogs, and if we would like to keep the concepts there, then a one-to-one mapping relationship between savepoint and catalog is a reasonable direction. In short, I'm happy that you've highlighted this and I agree as a whole. I've written it down previously, I just want to double confirm that the state catalog is essential and planned. When we reach this point, your input is more than welcome.
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> @Zakelly: We've tried the CLI and separate library approaches with users already, and these are not welcome because of the following:
> >>> >>> > > > > > >>>>>>>>>>> * Users want to have automated tasks and not manual CLI/library output parsing.
> >>> >>> > > > > > >>>>>>>>>>> This can be hacked around, but our experience is negative on this because it's just brittle.
> >>> >>> > > > > > >>>>>>>>>>> * From a development perspective it's a much bigger effort than a connector (hard to test, packaging/version handling is an extra layer of complexity, external FS authentication is a pain for users, as is expecting them to download savepoints)
> >>> >>> > > > > > >>>>>>>>>>> * Purely personal opinion, but if we find better ways later, then retiring a CLI is no more lightweight than retiring a connector
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>>> It would be great if you give some examples on how user could leverage the separate connector to process the metadata.
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> The simplest cases:
> >>> >>> > > > > > >>>>>>>>>>> * give me the overgrowing state uids
> >>> >>> > > > > > >>>>>>>>>>> * give me the unknown (new or renamed) state uids
> >>> >>> > > > > > >>>>>>>>>>> * give me the state uids where state size drastically dropped compared to a previous savepoint (accidental state loss)
> >>> >>> > > > > > >>>>>>>>>>>
> >>> >>> > > > > > >>>>>>>>>>> Since it was mentioned: as a general off-topic teaser, yeah, it would be good to have some sort of checkpoint/savepoint lineage or however we call it. Since we've not yet reached this point there are no technical details, it's more like a vision. It's a common pattern that jobs are physically running but somehow the state processing is stuck, and it would be good to add some way to find that out automatically.
> >>> >>> > > > > > >>>>>>>>>>> The important word here is automation, not manual evaluation, since handling 10k+ jobs just does not allow that.
BR,
G

On Wed, Mar 19, 2025 at 6:46 AM Shengkai Fang <fskm...@gmail.com> wrote:

Hi, All.

About State Catalog, I want to share more thoughts about this.

In the initial design concept, I understood that a savepoint and a state catalog have a one-to-one mapping relationship. Each operator corresponds to a database, and the state of each operator is represented as individual tables. The rationale behind this design is:

*State Diversity*: An operator may involve multiple types of states. For example, in our VVR design, a "multi-join" operator uses keyed states for two input streams and a broadcast state for the third stream.
This makes it challenging to represent all states of an operator within a single table.

*Scalability*: Internally, an operator might have multiple keyed states (e.g., value state and list state). However, large list states may not fit entirely in memory. To address this, we recommend implementing each state as a separate table.

To resolve the loosely coupled relationships between operator states, we propose embedding predefined views within the catalog. These views simplify user understanding of operator implementations and provide a more intuitive perspective.
For instance, a join operator may have multiple state implementations (depending on whether the join key includes unique attributes), but users primarily care about the data associated with a specific join key across input streams.

Returning to the one-to-one mapping between savepoints and catalogs, we aim to manage multiple user state catalogs through a catalog store. When a user triggers a savepoint for a job on the platform:

1. The platform sends a REST request to the JobManager.
2. Simultaneously, it registers a new state catalog in the catalog store, enabling immediate analysis of state data on the platform.
3. Deleting a savepoint would also trigger the removal of its associated catalog.
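The lifecycle in the steps above can be sketched with a small in-memory stand-in for the catalog store. This is only an illustration of the one-to-one savepoint/catalog mapping: `StateCatalogStore` and its method names are hypothetical, the REST call to the JobManager is left out, and the `savepoint-<id>` naming is borrowed from the catalog naming used elsewhere in this message.

```python
class StateCatalogStore:
    """In-memory stand-in for a catalog store (hypothetical, not a Flink class)."""

    def __init__(self):
        # catalog name -> savepoint path
        self._catalogs = {}

    @staticmethod
    def catalog_name(savepoint_id):
        return f"savepoint-{savepoint_id}"

    def on_savepoint_triggered(self, savepoint_id, path):
        """Register a catalog for a new savepoint (step 2)."""
        name = self.catalog_name(savepoint_id)
        if name in self._catalogs:
            raise ValueError(f"catalog {name} already registered")
        self._catalogs[name] = path
        return name

    def on_savepoint_deleted(self, savepoint_id):
        """Drop the catalog together with its savepoint (step 3)."""
        self._catalogs.pop(self.catalog_name(savepoint_id), None)

    def list_catalogs(self):
        return sorted(self._catalogs)


store = StateCatalogStore()
store.on_savepoint_triggered("1", "/savepoints/sp-1")  # example paths, made up
store.on_savepoint_triggered("2", "/savepoints/sp-2")
store.on_savepoint_deleted("1")
```

After these calls only `savepoint-2` remains registered, so the set of catalogs always mirrors the set of existing savepoints.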
This vision assumes that states are self-describing or that a state metaservice is introduced to analyze savepoint structures.

> How can users create logic to identify differences between multiple savepoints?

Since savepoints and state catalogs are one-to-one mapped, users can query metadata via their respective catalogs. For example:

1. `savepoint-${id}`.`system`.`metadata_table`.`<operator-name>` provides operator-specific metadata (e.g., state size, type).
2. Comparing metadata tables (e.g., schema versions, state entry counts) across catalogs reveals structural or quantitative differences.
3. For deeper analysis, users could write SQL queries to compare specific state partitions or leverage the metaservice to track state evolution (e.g., added/removed operators, modified state configurations).

If we plan to introduce a state catalog in the future, I would lean toward using metadata tables. If a utility tool can address the challenges we face, could we avoid introducing an additional connector?

Best,
Shengkai

Gyula Fóra <gyula.f...@gmail.com> wrote on Mon, Mar 17, 2025 at 20:25:

Hi All!

Without going into too much detail, here are my 2 cents regarding the virtual column / catalog metadata / table (connector) discussion for the State metadata.
State metadata such as the types of states, their properties, names, sizes etc. are all valuable information that can be used to enrich the computations we do on state.
We can either analyze it standalone (such as discovering anomalies, for large jobs with many states), across multiple savepoints (discover how state changed over time) or by joining it with keyed or non-keyed state data to serve more complex queries on the state.

The only solution that seems to serve all these use-cases and requirements in a straightforward and SQL canonical way is to simply expose the state metadata as a separate table.
This is a metadata table, but you can also think of it as a data table; it makes no practical difference here.

Once we have a catalog later, the catalog can offer this table out of the box, the same way databases provide metadata tables. For this to work, however, we need another, simpler connector that creates this table.

+1 for state metadata as a separate connector/table, instead of adding virtual columns and ad hoc catalog metadata that is hard to use in a large number of queries.

Cheers,
Gyula

On Mon, Mar 17, 2025 at 12:44 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:

1. State TTL for Value Columns

> I’m planning on adding this, and we may collaborate on it in the future.

+1 on this, just ping me.

2. Metadata Table vs. Metadata Column

After some code digging and a POC, all I can say is that with heavy effort we can maybe make changes so that we're able to show the metadata of a savepoint from the catalog.
I'm not against that, but from a user perspective this has limited value, let me explain why.
From a high-level perspective I see the following, which I think we agree on:
* We should have a catalog which represents the savepoint data set of one or more jobs (future plan)
* It should be possible to register savepoints in the catalog, which are then databases (future plan)
* There must be a possibility to create tables from databases where users can read state data (exists already)

In terms of metadata, if I understand correctly, the suggested approach would be to access it from the catalog describe command, right? Adding that info when a specific database describe command is executed could be done.
The question is, for instance, how can users create logic that tells them what the difference is between multiple savepoints?
Just to give some examples:
* per operator size changes between savepoints
* show values from operator data where state size reaches a boundary
* in general "find which checkpoint ruined things" is quite a common pattern

What I would like to highlight here is that from Flink's point of view the metadata can be considered static side-output information, but for users these values are actual real data that logic is planned to be built around.
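To make the "find which checkpoint ruined things" pattern concrete, here is a minimal Python sketch over a sequence of savepoints. It assumes the per-operator sizes of each savepoint are already available as a dictionary, ordered oldest first; `size_changes`, `first_savepoint_over_limit` and the sample numbers are hypothetical and only illustrate the kind of logic users would build on top of such metadata.

```python
def size_changes(history, uid):
    """Per-operator size delta between consecutive savepoints.

    history: list of (savepoint_id, {uid: size_bytes}), oldest first.
    """
    deltas = []
    for (_, older), (newer_id, newer) in zip(history, history[1:]):
        deltas.append((newer_id, newer.get(uid, 0) - older.get(uid, 0)))
    return deltas


def first_savepoint_over_limit(history, uid, limit_bytes):
    """Return the id of the first savepoint where the operator exceeds the limit."""
    for savepoint_id, sizes in history:
        if sizes.get(uid, 0) > limit_bytes:
            return savepoint_id
    return None


# Made-up sample data: state size of the "join" operator in four savepoints.
history = [
    ("sp-1", {"join": 100}),
    ("sp-2", {"join": 150}),
    ("sp-3", {"join": 900}),
    ("sp-4", {"join": 950}),
]
join_deltas = size_changes(history, "join")
culprit = first_savepoint_over_limit(history, "join", 500)
```

Here the jump shows up between `sp-2` and `sp-3`, which is the kind of answer that is hard to get by parsing describe output by hand.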
> The metadata is more like one-time information instead of a streaming data that changes all the time, so a single connector seems to be an overkill.

State data is also static within a savepoint, and that's the reason why the state processor API is working in batch mode.
When we handle multiple checkpoints in a streaming fashion, this can be viewed from another angle.

We can come up with a more lightweight solution than a new connector, but forcing users to parse the catalog describe command output in order to compare multiple savepoints doesn't sound like a smooth user experience.
Honestly, I have no other idea how to expose metadata as real user data, so I'm waiting on other approaches.

BR,
G

On Thu, Mar 13, 2025 at 2:44 AM Shengkai Fang <fskm...@gmail.com> wrote:

Looking forward to hearing the good news!

Best,
Shengkai

Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on Wed, Mar 12, 2025 at 22:24:

Thanks to both of you for the valuable input!

Let me take a closer look at the suggestions, like the Catalog capabilities and the possibility of embedding TypeInformation or StateDescriptor metadata directly into the raw state files...
BR,
G

On Wed, Mar 12, 2025 at 8:17 AM Shengkai Fang <fskm...@gmail.com> wrote:

Thanks for Zakelly's clarification.

1. State TTL for Value Columns

+1 to delay the discussion about this.

2. Metadata Table vs. Metadata Column

I’d like to share my perspective on the State Catalog proposal. While introducing this capability is beneficial, there is a blocker: the current StateBackend architecture does not permit operators to encode TypeInformation into the state; it only preserves the Serializer.
This limitation creates an asymmetry, as operators alone retain knowledge of the data structure’s schema.

To address this, I suggest allowing operators to embed TypeInformation or StateDescriptor metadata directly into the raw state files. Such a design would enable the Catalog to:

1. Parse state files and programmatically derive the schema and structural guarantees for each state.
2. Leverage existing Flink Table utilities, such as LegacyTypeInfoDataTypeConverter (in org.apache.flink.table.types.utils), to bridge TypeInformation and DataType conversions.
If we cannot store the TypeInformation or StateDescriptor into the raw state files, I am +1 for this FLIP to use a metadata column to retrieve the information.

Best,
Shengkai

Zakelly Lan <zakelly....@gmail.com> wrote on Wed, Mar 12, 2025 at 12:43:

Hi Gabor and Shengkai,

Thanks for sharing your thoughts! This is a long discussion and sorry for the late reply (I'm busy catching up with release 2.0 these days).

1. State TTL for Value Columns

Let me first clarify your thoughts to ensure I understand correctly. IIUC, there is no persistent configuration for state TTL in the checkpoint. While you can infer that TTL is enabled by reading the serializer, the checkpoint itself only stores the last access time for each value. So the only thing we can show is the last access time for each value. But it is not required for all state backends to store this, as they may directly store the expired time.
This will also increase the difficulty of implementation & maintenance.

This once again reiterates the importance of unified metadata for checkpoints. I’m planning on adding this, and we may collaborate on it in the future.

2. Metadata Table vs. Metadata Column

I'm not in favor of adding a new connector for metadata. The metadata is more like one-time information instead of a streaming data that changes all the time, so a single connector seems to be an overkill.
It is not easy to withdraw a connector if we have a better solution in future. I'm not familiar with current Catalog capabilities, and if it could extract and show some operator-level information from savepoint, that would be great.

If the Catalog can't do that, I would consider the current FLIP to be a compromise solution.
And if we have that unified metadata for checkpoint/savepoint in future, we may directly register savepoint in catalog, and create a source without specifying complex columns, as well as describe the savepoint catalog to get the metadata. That's a good solution in my mind.

Best,
Zakelly

On Wed, Mar 12, 2025 at 10:35 AM Shengkai Fang <fskm...@gmail.com> wrote:

Hi Gabor,

> 2. Adding a new connector with `savepoint-metadata`

I would argue against introducing a new connector type named savepoint-metadata, as the existing Catalog mechanism can inherently provide the necessary connector factory capabilities. I’ve detailed this proposal in branch[1]. Please take a moment to review it.

If we introduce a connector named `savepoint-metadata`, it means users can create a temporary table with connector `savepoint-metadata`, and the connector needs to check whether the table schema is the same as the schema we proposed in the FLIP.
On the other hand, it's not easy work for others to users a metadata table
with same schema.

[1] https://github.com/apache/flink/compare/master...fsk119:flink:state-metadata?expand=1#diff-712a7bc92fe46c405fb0e61b475bb2a005cb7a72bab7df28bbb92744bcb5f465R63

Best,
Shengkai

On Tue, Mar 11, 2025 at 4:56 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:

Hi Shengkai,

> 1. State TTL for Value Columns

From a directional perspective I agree with your idea of how it can be
implemented. Previously I've mentioned that TTL information is not exposed
on the state processor API (which the SQL state connector uses to read data)
and unless somebody shows me the opposite this FLIP is not going to address
this to avoid feature creep. Our users are also interested in TTL so sooner
or later we're going to expose it, this is a matter of scheduling.

> 2. Adding a new connector with `savepoint-metadata`

Not sure I understand your point at all related to StateCatalog. First of
all I can't agree more that StateCatalog is needed and is a planned building
block in an upcoming FLIP but not sure how can it help now? No matter what,
your knowledge is essential when we add StateCatalog.
Let me expose my understanding in this area:
* First we need create table statements to access state data and metadata
* When we have that then we can add StateCatalog which could potentially
  ease the life of users by for ex. giving off-the-shelf tables without
  sweating with create table statements

User expectations:
* See state data (this is fulfilled with the existing connector)
* See metadata about state data like TTL (this can be added as metadata
  column as you suggested since it belongs to the data)
* See metadata about operators (this can be added from savepoint-metadata)

Important to highlight that state data table format differs from state
metadata table format. Namely one table has rows for state values and
another has rows for operators, right?
I think that's the reason why you've pointed out that the suggested metadata
columns are somewhat clunky.

As a conclusion I agree to add ${state-name}_ttl metadata column later on
since it belongs to the state value and adding a new table type (like you
suggested similar to PG [1]) for metadata.
Please see how Spark does that too [2].

If you have a better approach then please elaborate with more details and
help me to understand your point.

> Up until now we've seen even in TB savepoints that the number of keys can
> be extremely huge but not the per key state itself.
> But again, this is a good feature as-is and can be handled in a separate
> jira.

I've just created https://issues.apache.org/jira/browse/FLINK-37456.

[1] https://www.postgresql.org/docs/current/view-pg-tables.html
[2] https://www.databricks.com/blog/announcing-state-reader-api-new-statestore-data-source

BR,
G

On Tue, Mar 11, 2025 at 3:55 AM Shengkai Fang <fskm...@gmail.com> wrote:

Hi, Gabor. Thanks for your response.

> 1. State TTL for Value Columns

Thank you for addressing the limitations here.
However, I believe it would be beneficial to further clarify the API in this
FLIP regarding how users can specify the TTL column.

One potential approach that comes to mind is using a standardized naming
convention such as ${state-name}_ttl for the metadata column that defines
the TTL value. In terms of implementation, the listReadableMetadata function
could:

1. Read the table's columns and configuration,
2. Extract all defined state names, and
3. Return a structured list of metadata entries formatted as
   ${state-name}_ttl.
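
The three steps above can be sketched roughly as follows. This is only an
illustration of the naming logic, written in Python for brevity; it is not
the connector's code. The function signature, the `key_column` argument and
the `state_name_overrides` mapping are all hypothetical names introduced
here, and the real listReadableMetadata on the Java side would also need to
declare a data type for each entry, which this sketch leaves out.

```python
def list_readable_metadata(columns, key_column, state_name_overrides=None):
    """Return the TTL metadata keys for a state table (illustrative only).

    columns: column names of the table, in declaration order.
    key_column: name of the key column, which has no state behind it.
    state_name_overrides: hypothetical mapping from column name to state
        name, for columns whose state is registered under another name.
    """
    overrides = state_name_overrides or {}

    # Step 1: read the table's columns and configuration (passed in here).
    value_columns = [c for c in columns if c != key_column]

    # Step 2: extract the state name behind each value column.
    state_names = [overrides.get(c, c) for c in value_columns]

    # Step 3: one metadata entry per state, named ${state-name}_ttl.
    return [f"{name}_ttl" for name in state_names]


if __name__ == "__main__":
    print(list_readable_metadata(["k", "MyValueState", "MyListState"], "k"))
```

Deriving the keys from the table definition keeps the convention mechanical:
a user who knows the state name can predict the metadata column name without
any lookup.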

WDYT?

> 2. Adding a new connector with `savepoint-metadata`

Introducing a new connector type at this stage may unnecessarily complicate
the system. Given that every table already belongs to a Catalog, which is
designed to provide a Factory for building source or sink connectors, I
propose integrating a dedicated StateCatalog instead. This approach would
allow us to:

1. Leverage the Catalog's existing capabilities to manage TTL metadata
   (e.g., state names and TTL logic) without duplicating functionality.
2. Provide a unified interface for connector instantiation and metadata
   handling through the Catalog's Factory pattern.

Would this design decision better align with our architecture's
extensibility and reduce redundancy?

> Up until now we've seen even in TB savepoints that the number of keys can
> be extremely huge but not the per key state itself.
> But again, this is a good feature as-is and can be handled in a separate
> jira.

+1 for a separate jira.

Best,
Shengkai

On Mon, Mar 10, 2025 at 7:05 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:

Hi Shengkai,

Please see my comments inline.

BR,
G

On Mon, Mar 3, 2025 at 7:07 AM Shengkai Fang <fskm...@gmail.com> wrote:

> Hi, Gabor. Thanks for the FLIP.
> I have some questions about the FLIP:
>
> 1. State TTL for Value Columns
> How can users retrieve the state TTL (Time-to-Live) for each value column?
> From my understanding of the current design, it seems that this
> functionality is not supported. Could you clarify if there are plans to
> address this limitation?

Since the state processor API is not yet exposing this information this
would require several steps.
First, the state processor API support needs to be added which can be then
exposed on the SQL API.
This is definitely a future improvement which is useful and can be handled
in a separate jira.

> 2. Metadata Table vs. Metadata Column
> The metadata information described in the FLIP appears to be intended to
> describe the state files stored at a specific location.
> To me, this concept aligns more closely with system tables like pg_tables
> in PostgreSQL [1] or the INFORMATION_SCHEMA in MySQL [2].

Adding a new connector with `savepoint-metadata` is a possibility where we
can create such functionality.
I'm not against that, just want to have a common agreement that we would
like to move in that direction.
(As a side note not just PG but Spark also has a similar approach and I
basically like the idea).
If we would go that direction savepoint metadata can be reached in a way
that one row would represent an operator with its values something like
this:

┌─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬─────────┬────────┐
│operatorN│operatorU│operatorH│paralleli│maxParall│subtaskSt│coordinat│totalSta│
│ame      │id       │ash      │sm       │elism    │atesCount│orStateSi│tesSizeI│
│         │         │         │         │         │         │zeInBytes│nBytes  │
├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┤
│Source:  │datagen-s│47aee9439│2        │128      │2        │16       │546     │
│datagen-s│ource-uid│4d6ea26e2│         │         │         │         │        │
│ource    │         │d544bef0a│         │         │         │         │        │
│         │         │37bb5    │         │         │         │         │        │
├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┤
│long-udf-│long-udf-│6ed3f40bf│2        │128      │2        │0        │0       │
│with-mast│with-mast│f3c8dfcdf│         │         │         │         │        │
│er-hook  │er-hook-u│cb95128a1│         │         │         │         │        │
│         │id       │018f1    │         │         │         │         │        │
├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┤
│value-pro│value-pro│ca4f5fe9a│2        │128      │2        │0        │40726   │
│cess     │cess-uid │637b656f0│         │         │         │         │        │
│         │         │9ea78b3e7│         │         │         │         │        │
> > > >>>>>>>>>>>>>>>>>>> │ > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │ > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │ │ │a15b9 │ > >>> >>> > > > > > >>>> │ > >>> >>> > > > > > >>>>>>>>> │ > >>> >>> > > > > > >>>>>>>>>>>>>> │ > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> │ > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> │ > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>> > >>> >>> > > > > > >>>>>>>>> > >>> >>> > > > > > >>>>>>>> > >>> >>> > > > > > >>>>>>> > >>> >>> > > > > > >>>>> > >>> >>> > > > > > >>>> > >>> >>> > > > > > >> > >>> >>> > > > > > > >>> >>> > > > > > >>> >>> > > > > >>> >>> > > > >>> >>> > > >>> >>> > >>> > ├─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼─────────┼────────┤ > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> This table can then be joined with > >>> >>> > > > > > >> the > >>> >>> > > > > > >>>>>>> actually > >>> >>> > > > > > >>>>>>>>>>>> existing > >>> >>> > > > > > >>>>>>>>>>>>>>>>>> `savepoint` > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> connector created tables based on > >>> UID > >>> >>> > > > > > >>>> hash > >>> >>> > > > > > >>>>>>>> (which > >>> >>> > > > > > >>>>>>>>>> is > >>> >>> > > > > > >>>>>>>>>>>>> unique > >>> >>> > > > > > >>>>>>>>>>>>>>> and > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>> always > >>> >>> > > > > > >>>>>>>>>>>>>>>>>>>>>> exists). 
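The join on the UID hash described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the connector's behaviour: the operator rows reuse the names and hashes from the table (assuming the wrapped hash cells join without separators), while `state_rows`, `join_on_uid_hash`, and the field names are hypothetical.

```python
# Operator metadata rows; only name, UID and UID hash are kept for brevity.
operators = [
    {
        "name": "long-udf-with-master-hook",
        "uid": "long-udf-with-master-hook-uid",
        "uid_hash": "6ed3f40bff3c8dfcdfcb95128a1018f1",
    },
    {
        "name": "value-process",
        "uid": "value-process-uid",
        "uid_hash": "ca4f5fe9a637b656f09ea78b3e7a15b9",
    },
]

# Hypothetical rows read from a state table that exposes a single
# metadata column: the UID hash of the operator the state belongs to.
state_rows = [
    {"uid_hash": "ca4f5fe9a637b656f09ea78b3e7a15b9", "key": 1, "value": 10},
    {"uid_hash": "ca4f5fe9a637b656f09ea78b3e7a15b9", "key": 2, "value": 20},
]


def join_on_uid_hash(state_rows, operators):
    """Inner-join state rows with operator metadata on the UID hash."""
    by_hash = {op["uid_hash"]: op for op in operators}
    joined = []
    for row in state_rows:
        op = by_hash.get(row["uid_hash"])
        if op is not None:
            joined.append(
                {**row, "operator_name": op["name"], "operator_uid": op["uid"]}
            )
    return joined


joined = join_on_uid_hash(state_rows, operators)
```

Because the hash is unique per operator, a dictionary lookup is enough; the human-readable UID comes from the metadata side of the join.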
> This would mean that the already existing table would need only a single
> metadata column which is the UID hash.
> WDYT?
> @zakelly, plz share your thoughts too.
>
> > If we opt to use metadata columns, every record in the table would end up
> > having identical values for these columns (please correct me if I’m
> > mistaken). On the other hand, the state connector requires users to
> > specify an operator UID or operator UID hash, after which it outputs
> > user-defined values in its records. This approach feels somewhat
> > redundant to me.
>
> If we would add a new `savepoint-metadata` connector then this can be
> addressed.
> On the other hand, UID and UID hash have an either-or relationship from
> the config perspective, so when a user provides the UID they may still be
> interested in the hash for further calculations (the whole Flink internals
> depend on the hash). Printing out the human-readable UID is an explicit
> requirement from the user side because hashes are not human readable.
>
> > 3. Handling LIST and MAP States in the State Connector
> > I have concerns about how the current design handles LIST and MAP states.
> > Specifically, the state connector uses Flink SQL’s MAP and ARRAY types,
> > which implies that it attempts to load entire MAP or LIST states into
> > memory.
> >
> > However, in many real-world scenarios, these states can grow very large.
> > Typically, the state API addresses this by providing an iterator to
> > traverse elements within the state incrementally. I’m unsure whether I’ve
> > missed something in FLIP-496 or FLIP-512, but it seems that the current
> > design might struggle with scalability in such cases.
>
> You see it right: the current implementation keeps the state for a single
> key in memory.
> Back in the day we considered this potential issue and concluded that it
> is not necessarily needed for the initial version and can be done as a
> later improvement.
>
> Up until now we've seen, even in TB-sized savepoints, that the number of
> keys can be extremely large but not the per-key state itself.
> But again, this is a good feature as-is and can be handled in a separate
> Jira.
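The memory concern discussed above comes down to materializing a whole per-key list versus walking it element by element. The following Python sketch contrasts the two; it is not Flink code, and `stored_entries`, `read_list_state_eagerly`, and `read_list_state_lazily` are hypothetical names used only for illustration.

```python
from typing import Iterable, Iterator, List


def stored_entries(n: int) -> Iterator[int]:
    """Hypothetical stand-in for the elements of one key's list state."""
    return (i for i in range(n))


def read_list_state_eagerly(entries: Iterable[int]) -> List[int]:
    """Materialize every element at once, as a single ARRAY value requires."""
    return list(entries)


def read_list_state_lazily(entries: Iterable[int]) -> Iterator[int]:
    """Yield elements one at a time so the caller never holds the full list."""
    for entry in entries:
        yield entry


# Eager read: the full list for the key is resident in memory.
all_at_once = read_list_state_eagerly(stored_entries(3))

# Lazy read: elements are consumed incrementally while aggregating.
total = sum(read_list_state_lazily(stored_entries(1000)))
```

The eager variant is fine while per-key state stays small, which matches the observation in the reply; the lazy variant is the shape a later improvement would need if per-key state grows large.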
> > Best,
> > Shengkai
> >
> > [1] https://www.postgresql.org/docs/current/view-pg-tables.html
> > [2] https://dev.mysql.com/doc/refman/8.4/en/information-schema-tables-table.html
> >
> > Gabor Somogyi <gabor.g.somo...@gmail.com> wrote on Mon, Mar 3, 2025 at 02:00:
> >
> > > Hi Zakelly,
> > >
> > > In order to shoot for simplicity `METADATA VIRTUAL` as key words for
> > > definition is the target.
> > > When it's not super complex the latter can be added too.
> > >
> > > BR,
> > > G
> > >
> > > On Sun, Mar 2, 2025 at 3:37 PM Zakelly Lan <zakelly....@gmail.com> wrote:
> > >
> > > > Hi Gabor,
> > > >
> > > > +1 for this.
> > > >
> > > > Will the metadata column use `METADATA VIRTUAL` as key words for
> > > > definition, or `METADATA FROM xxx VIRTUAL` for renaming, just like
> > > > the Kafka table?
> > > > Best,
> > > > Zakelly
> > > >
> > > > On Sat, Mar 1, 2025 at 1:31 PM Gabor Somogyi <gabor.g.somo...@gmail.com> wrote:
> > > >
> > > > > Hi All,
> > > > >
> > > > > I'd like to start a discussion of FLIP-512: Add meta information to
> > > > > SQL state connector [1].
> > > > > Feel free to add your thoughts to make this feature better.
> > > > >
> > > > > [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-512%3A+Add+meta+information+to+SQL+state+connector
> > > > >
> > > > > BR,
> > > > > G