Thanks for the proposal. I like the FLIP. My ranking:
1. Refresh(ing) / Live Table -> easy to understand and implies the dynamic characteristic.
2. Derived Table -> easy to understand.
3. Materialized Table -> sounds like just a table with physical data stored somewhere.
4. Materialized View -> modifying a view directly is a little weird.

Thanks,
Jiangjie (Becket) Qin

On Tue, Apr 9, 2024 at 5:46 AM Lincoln Lee <lincoln.8...@gmail.com> wrote:

> Thanks Ron and Timo for your proposal!
>
> Here is my ranking:
>
> 1. Derived Table -> extends the persistent semantics of the derived table in the SQL standard, has a strong association with the query, and has industry precedents such as Google Looker.
>
> 2. Live Table -> an alternative for 'dynamic table'.
>
> 3. Materialized Table -> a combination of Materialized View and Table, but still a table which accepts data changes.
>
> 4. Materialized View -> needs to extend the understanding of a view to accept data changes.
>
> The reason for not adding 'Refresh Table' is that I don't want to tell the user to 'refresh a refresh table'.
>
> Best,
> Lincoln Lee
>
> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 20:11:
>
> > Hi, Dev
> >
> > My rankings are:
> >
> > 1. Derived Table
> > 2. Materialized Table
> > 3. Live Table
> > 4. Materialized View
> >
> > Best,
> > Ron
> >
> > Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 20:07:
> >
> > > Hi, Dev
> > >
> > > After several rounds of discussion, there is currently no consensus on the name of the new concept. Timo has proposed that we decide the name through a vote. This is a good solution when there is no clear preference, so we will adopt this approach.
> > >
> > > Regarding the name of the new concept, there are currently five candidates:
> > > 1. Derived Table -> taken by SQL standard
> > > 2. Materialized Table -> similar to SQL materialized view but a table
> > > 3. Live Table -> similar to dynamic tables
> > > 4. Refresh Table -> states what it does
> > > 5. Materialized View -> needs to extend the standard to support modifying data
> > >
> > > For the above five candidates, everyone can give their rankings based on their preferences. You can rank all five options or only some of them.
> > > We will use a scoring rule, where the *first rank gets 5 points, second rank gets 4 points, third rank gets 3 points, fourth rank gets 2 points, and fifth rank gets 1 point*.
> > > After the voting closes, I will score all the candidates based on everyone's votes, and the candidate with the highest score will be chosen as the name for the new concept.
> > >
> > > The voting will last up to 72 hours and is expected to close this Friday. I look forward to everyone voting on the name in this thread. Of course, we also welcome new input regarding the name.
> > >
> > > Best,
> > > Ron
> > >
> > > Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 19:49:
> > >
> > >> Hi, Dev
> > >>
> > >> Sorry, my previous statement was not quite accurate. We will hold a vote for the name within this thread.
> > >>
> > >> Best,
> > >> Ron
> > >>
> > >> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 19:29:
> > >>
> > >>> Hi, Timo
> > >>>
> > >>> Thanks for your reply.
> > >>>
> > >>> I agree with you that sometimes naming is more difficult. When no one has a clear preference, voting on the name is a good solution, so I'll send a separate email for the vote, clarify the rules for the vote, then let everyone vote.
> > >>>
> > >>> One other point to confirm: in your ranking there is an option for Materialized View. Does it stand for the UPDATING Materialized View that you mentioned earlier in the discussion? If we use Materialized View, I think it needs to be extended.
> > >>>
> > >>> Best,
> > >>> Ron
> > >>>
> > >>> Timo Walther <twal...@apache.org> wrote on Tue, Apr 9, 2024 at 17:20:
> > >>>
> > >>>> Hi Ron,
> > >>>>
> > >>>> Yes, naming is hard. But it will have a large impact on trainings, presentations, and the mental model of users. Maybe the easiest is to collect a ranking from everyone with some short justification:
> > >>>>
> > >>>> My ranking (from good to not so good):
> > >>>>
> > >>>> 1. Refresh Table -> states what it does
> > >>>> 2. Materialized Table -> similar to SQL materialized view but a table
> > >>>> 3. Live Table -> nice buzzword, but maybe still too close to dynamic tables?
> > >>>> 4. Materialized View -> a bit broader than standard but still very similar
> > >>>> 5. Derived table -> taken by standard
> > >>>>
> > >>>> Regards,
> > >>>> Timo
> > >>>>
> > >>>> On 07.04.24 11:34, Ron liu wrote:
> > >>>> > Hi, Dev
> > >>>> >
> > >>>> > This is a summary letter. After several rounds of discussion, there is a strong consensus about the FLIP proposal and the issues it aims to address. The current point of disagreement is the naming of the new concept. I have summarized the candidates as follows:
> > >>>> >
> > >>>> > 1. Derived Table (inspired by Google Looker)
> > >>>> >    - Pros: Google Looker has introduced this concept, which is designed for building Looker's automated modeling, aligning with our purpose for the stream-batch automatic pipeline.
> > >>>> >    - Cons: The SQL standard uses the term "derived table" extensively, and vendors adopt it for simply referring to a table within a subclause.
> > >>>> >
> > >>>> > 2. Materialized Table: It means materializing the query result to a table, similar to Db2 MQT (Materialized Query Tables). In addition, Snowflake Dynamic Table's predecessor is also called Materialized Table.
> > >>>> >
> > >>>> > 3. Updating Table (from Timo)
> > >>>> >
> > >>>> > 4. Updating Materialized View (from Timo)
> > >>>> >
> > >>>> > 5. Refresh/Live Table (from Martijn)
> > >>>> >
> > >>>> > As Martijn said, naming is a headache. Looking forward to more valuable input from everyone.
> > >>>> >
> > >>>> > [1] https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables
> > >>>> > [2] https://www.ibm.com/docs/en/db2/11.5?topic=tables-materialized-query
> > >>>> > [3] https://community.denodo.com/docs/html/browse/6.0/vdp/vql/materialized_tables/creating_materialized_tables/creating_materialized_tables
> > >>>> >
> > >>>> > Best,
> > >>>> > Ron
> > >>>> >
> > >>>> > Ron liu <ron9....@gmail.com> wrote on Sun, Apr 7, 2024 at 15:55:
> > >>>> >
> > >>>> >> Hi, Lorenzo
> > >>>> >>
> > >>>> >> Thank you for your insightful input.
> > >>>> >>
> > >>>> >>>>> I think the 2 above twisted the materialized view concept to more than just an optimization for accessing pre-computed aggregates/filters.
> > >>>> >>>>> I think that concept (at least in my mind) is now more adherent to the semantics of the words themselves ("materialized" and "view") than to its implementations in DBMSs, as just a view on raw data that, hopefully, is constantly updated with fresh results.
> > >>>> >>>>> That's why I understand Timo's et al. objections.
> > >>>> >>
> > >>>> >> Your understanding of Materialized Views is correct. However, in our scenario, an important feature is the support for Update & Delete operations, which the current Materialized Views cannot fulfill.
> > >>>> >> As we discussed with Timo before, if Materialized Views need to support data modifications, it would require an extension with new keywords, such as CREATE xxx (UPDATING) MATERIALIZED VIEW.
> > >>>> >>
> > >>>> >>>>> Still, I don't understand why we need another type of special table.
> > >>>> >>>>> Could you dive deep into the reasons why not simply adding the FRESHNESS parameter to standard tables?
> > >>>> >>
> > >>>> >> Firstly, I need to emphasize that we cannot achieve the design goal of the FLIP through the CREATE TABLE syntax combined with a FRESHNESS parameter. The proposal of this FLIP is to use Dynamic Table + Continuous Query, and combine it with FRESHNESS to realize a streaming-batch unification. However, CREATE TABLE is merely a metadata operation and cannot automatically start a background refresh job. To achieve the design goal of the FLIP with standard tables, it would require extending the CTAS [1] syntax to introduce the FRESHNESS keyword. We considered this design initially, but it has the following problems:
> > >>>> >>
> > >>>> >> 1. Distinguishing a table created through CTAS as a standard table or as a "special" standard table with an ongoing background refresh job using the FRESHNESS keyword is very obscure for users.
> > >>>> >> 2. It intrudes on the semantics of the CTAS syntax. Currently, tables created using CTAS only add table metadata to the Catalog and do not record attributes such as the query. There are also no ongoing background refresh jobs, and the data writing operation happens only once at table creation.
> > >>>> >> 3. For the framework, when we perform a certain kind of ALTER TABLE operation, it is unclear how to distinguish the behavior for a table created with FRESHNESS from that for a table created without it, which will also cause confusion.
> > >>>> >>
> > >>>> >> In terms of the design goal of combining Dynamic Table + Continuous Query, the FLIP proposal cannot be realized by only extending the current standard tables, so a new kind of dynamic table needs to be introduced as a first-level concept.
> > >>>> >>
> > >>>> >> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#as-select_statement
> > >>>> >>
> > >>>> >> Best,
> > >>>> >> Ron
> > >>>> >>
> > >>>> >> <lorenzo.affe...@ververica.com.invalid> wrote on Wed, Apr 3, 2024 at 22:25:
> > >>>> >>
> > >>>> >>> Hello everybody!
> > >>>> >>> Thanks for the FLIP as it looks amazing (and I think the proof is this deep discussion it is provoking :))
> > >>>> >>>
> > >>>> >>> I have a couple of comments to add to this:
> > >>>> >>>
> > >>>> >>> Even though I get the reason why you rejected MATERIALIZED VIEW, I still like it a lot, and I would like to provide pointers on how the materialized view concept twisted in the last years:
> > >>>> >>>
> > >>>> >>> • Materialize DB (https://materialize.com/)
> > >>>> >>> • The famous talk by Martin Kleppmann "turning the database inside out" (https://www.youtube.com/watch?v=fU9hR3kiOK0)
> > >>>> >>>
> > >>>> >>> I think the 2 above twisted the materialized view concept to more than just an optimization for accessing pre-computed aggregates/filters.
> > >>>> >>> I think that concept (at least in my mind) is now more adherent to the semantics of the words themselves ("materialized" and "view") than to its implementations in DBMSs, as just a view on raw data that, hopefully, is constantly updated with fresh results.
> > >>>> >>> That's why I understand Timo's et al. objections.
> > >>>> >>> Still I understand there is no need to add confusion :)
> > >>>> >>>
> > >>>> >>> Still, I don't understand why we need another type of special table.
> > >>>> >>> Could you dive deep into the reasons why not simply adding the FRESHNESS parameter to standard tables?
> > >>>> >>>
> > >>>> >>> I would say that as a very seamless implementation with the goal of a unification of batch and streaming.
> > >>>> >>> If we stick to a unified world, I think that Flink should just provide 1 type of table that is inherently dynamic.
> > >>>> >>> Now, depending on FRESHNESS objectives / connectors used in WITH, that table can be backed by a stream or batch job as you explained in your FLIP.
> > >>>> >>>
> > >>>> >>> Maybe I am totally missing the point :)
> > >>>> >>>
> > >>>> >>> Thank you in advance,
> > >>>> >>> Lorenzo
> > >>>> >>>
> > >>>> >>> On Apr 3, 2024 at 15:25 +0200, Martijn Visser <martijnvis...@apache.org>, wrote:
> > >>>> >>>> Hi all,
> > >>>> >>>>
> > >>>> >>>> Thanks for the proposal. While the FLIP talks extensively on how Snowflake has Dynamic Tables and Databricks has Delta Live Tables, my understanding is that Databricks has CREATE STREAMING TABLE [1] which relates with this proposal.
> > >>>> >>>>
> > >>>> >>>> I do have concerns about using CREATE DYNAMIC TABLE, specifically about confusing the users who are familiar with Snowflake's approach where you can't change the content via DML statements, while that is something that would work in this proposal. Naming is hard of course, but I would probably prefer something like CREATE CONTINUOUS TABLE, CREATE REFRESH TABLE or CREATE LIVE TABLE.
> > >>>> >>>>
> > >>>> >>>> Best regards,
> > >>>> >>>>
> > >>>> >>>> Martijn
> > >>>> >>>>
> > >>>> >>>> [1] https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-streaming-table.html
> > >>>> >>>>
> > >>>> >>>> On Wed, Apr 3, 2024 at 5:19 AM Ron liu <ron9....@gmail.com> wrote:
> > >>>> >>>>
> > >>>> >>>>> Hi, dev
> > >>>> >>>>>
> > >>>> >>>>> After offline discussion with Becket Qin, Lincoln Lee and Jark Wu, we have improved some parts of the FLIP.
> > >>>> >>>>>
> > >>>> >>>>> 1. Add Full Refresh Mode section to clarify the semantics of full refresh mode.
> > >>>> >>>>> 2. Add Future Improvement section explaining why the query statement does not support references to temporary views, and possible solutions.
> > >>>> >>>>> 3. The Future Improvement section explains a possible future solution for dynamic table to support the modification of query statements to meet the common field-level schema evolution requirements of the lakehouse.
> > >>>> >>>>> 4. The Refresh section emphasizes that the Refresh command and the background refresh job can be executed in parallel, with no restrictions at the framework level.
> > >>>> >>>>> 5. Convert RefreshHandler into a plug-in interface to support various workflow schedulers.
> > >>>> >>>>>
> > >>>> >>>>> Best,
> > >>>> >>>>> Ron
> > >>>> >>>>>
> > >>>> >>>>> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 2, 2024 at 10:28:
> > >>>> >>>>>
> > >>>> >>>>>>> Hi, Venkata krishnan
> > >>>> >>>>>>>
> > >>>> >>>>>>> Thank you for your involvement and suggestions, and I hope that the design goals of this FLIP will be helpful to your business.
> > >>>> >>>>>>>
> > >>>> >>>>>>>>>>>>> 1. In the proposed FLIP, given the example for the dynamic table, do the data sources always come from a single lake storage such as Paimon or does the same proposal solve for 2 disparate storage systems like Kafka and Iceberg where Kafka events are ETLed to Iceberg similar to Paimon? Basically the lambda architecture that is mentioned in the FLIP as well. I'm wondering if it is possible to switch b/w sources based on the execution mode, for eg: if it is backfill operation, switch to a data lake storage system like Iceberg, otherwise an event streaming system like Kafka.
> > >>>> >>>>>>>
> > >>>> >>>>>>> Dynamic table is a design abstraction at the framework level and is not tied to the physical implementation of the connector. If a connector supports a combination of Kafka and lake storage, this works fine.
> > >>>> >>>>>>>
> > >>>> >>>>>>>>>>>>> 2. What happens in the context of a bootstrap (batch) + nearline update (streaming) case that are stateful applications?
> > >>>> >>>>>>>>>>>>> What I mean by that is, will the state from the batch application be transferred to the nearline application after the bootstrap execution is complete?
> > >>>> >>>>>>>
> > >>>> >>>>>>> I think this is another orthogonal thing, something that FLIP-327 tries to address, not directly related to Dynamic Table.
> > >>>> >>>>>>>
> > >>>> >>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-327%3A+Support+switching+from+batch+to+stream+mode+to+improve+throughput+when+processing+backlog+data
> > >>>> >>>>>>>
> > >>>> >>>>>>> Best,
> > >>>> >>>>>>> Ron
> > >>>> >>>>>>>
> > >>>> >>>>>>> Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote on Sat, Mar 30, 2024 at 07:06:
> > >>>> >>>>>>>
> > >>>> >>>>>>>>> Ron and Lincoln,
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Great proposal and interesting discussion for adding support for dynamic tables within Flink.
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> At LinkedIn, we are also trying to solve compute/storage convergence for similar problems discussed as part of this FLIP, specifically periodic backfill, bootstrap + nearline update use cases using single implementation of business logic (single script).
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Few clarifying questions:
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> 1. In the proposed FLIP, given the example for the dynamic table, do the data sources always come from a single lake storage such as Paimon or does the same proposal solve for 2 disparate storage systems like Kafka and Iceberg where Kafka events are ETLed to Iceberg similar to Paimon?
> > >>>> >>>>>>>>> Basically the lambda architecture that is mentioned in the FLIP as well. I'm wondering if it is possible to switch b/w sources based on the execution mode, for eg: if it is backfill operation, switch to a data lake storage system like Iceberg, otherwise an event streaming system like Kafka.
> > >>>> >>>>>>>>> 2. What happens in the context of a bootstrap (batch) + nearline update (streaming) case that are stateful applications? What I mean by that is, will the state from the batch application be transferred to the nearline application after the bootstrap execution is complete?
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> Regards
> > >>>> >>>>>>>>> Venkata krishnan
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>> On Mon, Mar 25, 2024 at 8:03 PM Ron liu <ron9....@gmail.com> wrote:
> > >>>> >>>>>>>>>
> > >>>> >>>>>>>>>>> Hi, Timo
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> Thanks for your quick response, and your suggestion.
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> Yes, this discussion has turned into confirming whether it's a special table or a special MV.
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> 1. The key problem with MVs is that they don't support modification, so I prefer it to be a special table. Although the periodic refresh behavior is more characteristic of an MV, since we are already a special table, supporting periodic refresh behavior is quite natural, similar to Snowflake dynamic tables.
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> 2. Regarding the keyword UPDATING, since the current Regular Table is a Dynamic Table, which implies support for updating through Continuous Query, I think it is redundant to add the keyword UPDATING. In addition, UPDATING cannot reflect the Continuous Query part, and cannot express our purpose of simplifying the data pipeline through Dynamic Table + Continuous Query.
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> 3. From the perspective of the SQL standard definition, I can understand your concerns about Derived Table, but is it possible to make a slight adjustment to meet our needs? Additionally, as Lincoln mentioned, the Google Looker platform has introduced Persistent Derived Table, and there are precedents in the industry; could Derived Table be a candidate?
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> Of course, I look forward to your better suggestions.
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> Best,
> > >>>> >>>>>>>>>>> Ron
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>> Timo Walther <twal...@apache.org> wrote on Mon, Mar 25, 2024 at 18:49:
> > >>>> >>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> After thinking about this more, this discussion boils down to whether this is a special table or a special materialized view.
> > >>>> >>>>>>>>>>>>> In both cases, we would need to add a special keyword:
> > >>>> >>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> Either
> > >>>> >>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> CREATE UPDATING TABLE
> > >>>> >>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> or
> > >>>> >>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> CREATE UPDATING MATERIALIZED VIEW
> > >>>> >>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> I still feel that the periodic refreshing behavior is closer to a MV. If we add a special keyword to MV, the optimizer would know that the data cannot be used for query optimizations.
> > >>>> >>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> I will ask more people for their opinion.
> > >>>> >>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> Regards,
> > >>>> >>>>>>>>>>>>> Timo
> > >>>> >>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>> On 25.03.24 10:45, Timo Walther wrote:
> > >>>> >>>>>>>>>>>>>>> Hi Ron and Lincoln,
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> thanks for the quick response and the very insightful discussion.
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> we might limit future opportunities to optimize queries
> > >>>> >>>>>>>>>>>>>>>>> through automatic materialization rewriting by allowing data
> > >>>> >>>>>>>>>>>>>>>>> modifications, thus losing the potential for such optimizations.
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> This argument makes a lot of sense to me. Due to the updates, the system is not in full control of the persisted data. However, the system is still in full control of the job that powers the refresh.
> > >>>> >>>>>>>>>>>>>>> So if the system manages all updating pipelines, it could still leverage automatic materialization rewriting but without leveraging the data at rest (only the data in flight).
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> we are considering another candidate, Derived Table, the term 'derive'
> > >>>> >>>>>>>>>>>>>>>>> suggests a query, and 'table' retains modifiability. This approach
> > >>>> >>>>>>>>>>>>>>>>> would not disrupt our current concept of a dynamic table
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> I did some research on this term. The SQL standard uses the term "derived table" extensively (defined in section 4.17.3). Thus, a lot of vendors adopt this for simply referring to a table within a subclause:
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> https://dev.mysql.com/doc/refman/8.0/en/derived-tables.html
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> https://infocenter.sybase.com/help/topic/com.sybase.infocenter.dc32300.1600/doc/html/san1390612291252.html
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> https://www.c-sharpcorner.com/article/derived-tables-vs-common-table-expressions/
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> https://stackoverflow.com/questions/26529804/what-are-the-derived-tables-in-my-explain-statement
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> https://www.sqlservercentral.com/articles/sql-derived-tables
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> Esp. the latter example is interesting, SQL Server allows things like this on derived tables:
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> UPDATE T SET Name='Timo' FROM (SELECT * FROM Product) AS T
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> SELECT * FROM Product;
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> Btw also Snowflake's dynamic table state:
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> Because the content of a dynamic table is fully determined
> > >>>> >>>>>>>>>>>>>>>>> by the given query, the content cannot be changed by using DML.
> > >>>> >>>>>>>>>>>>>>>>> You don’t insert, update, or delete the rows in a dynamic table.
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> So a new term makes a lot of sense.
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> How about using `UPDATING`?
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> CREATE UPDATING TABLE
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> This reflects that modifications can be made and from an English-language perspective you can PAUSE or RESUME the UPDATING. Thus, a user can define UPDATING interval and mode?
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> Looking forward to your thoughts.
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> Regards,
> > >>>> >>>>>>>>>>>>>>> Timo
> > >>>> >>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>> On 25.03.24 07:09, Ron liu wrote:
> > >>>> >>>>>>>>>>>>>>>>> Hi, Ahmed
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> Thanks for your feedback.
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> Regarding your question:
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>>>> I want to iterate on Timo's comments regarding the confusion between "Dynamic Table" and current Flink "Table". Should the refactoring of the system happen in 2.0, should we rename it in this Flip ( as the suggestions in the thread ) and address the holistic changes in a separate Flip for 2.0?
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> Lincoln proposed a new concept in reply to Timo: Derived Table, which is a combination of Dynamic Table + Continuous Query. The use of Derived Table will not conflict with existing concepts. What do you think?
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>>>> I feel confused with how it is further with other components, the examples provided feel like a standalone ETL job, could you provide in the FLIP an example where the table is further used in subsequent queries (specially in batch mode).
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> Thanks for your suggestion. I added how to use Dynamic Table to the FLIP user story section. A Dynamic Table can be referenced by downstream Dynamic Tables and can also support OLAP queries.
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> Best,
> > >>>> >>>>>>>>>>>>>>>>> Ron
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>> Ron liu <ron9....@gmail.com> wrote on Sat, Mar 23, 2024 at 10:35:
> > >>>> >>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>>>> Hi, Feng
> > >>>> >>>>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>>>> Thanks for your feedback.
> > >>>> >>>>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>>>>>> Although currently we restrict users from modifying the query, I wonder if we can provide a better way to help users rebuild it without affecting downstream OLAP queries.
> > >>>> >>>>>>>>>>>>>>>>>>>
> > >>>> >>>>>>>>>>>>>>>>>>> Considering the problem of data consistency, in the first step we strictly limit the semantics and do not support modifying the query.
This is really a good problem, one of my ideas is to introduce a syntax similar to SWAP [1], which supports exchanging two Dynamic Tables.

> From the documentation, the definitions SQL and job information are stored in the Catalog. Does this mean that if a system needs to adapt to Dynamic Tables, it also needs to store Flink's job information in the corresponding system? For example, does MySQL's Catalog need to store flink job information as well?

Yes, currently we need to rely on Catalog to store refresh job information.

> Users still need to consider how much memory is being used, how large the concurrency is, which type of state backend is being used, and may need to set TTL expiration.

Similar to the current practice, job parameters can be set via the Flink conf or SET commands.

> When we submit a refresh command, can we help users detect if there are any running jobs and automatically stop them before executing the refresh command? Then wait for it to complete before restarting the background streaming job?

Purely from a technical implementation point of view, your proposal is doable, but it would be more costly. Also I think data consistency itself is the responsibility of the user, similar to how Regular Table is now also the responsibility of the user, so it's consistent with its behavior and no additional guarantees are made at the engine level.

Best,
Ron


Ahmed Hamdy <hamdy10...@gmail.com> wrote on Fri, Mar 22, 2024 at 23:50:

Hi Ron,
Sorry for joining the discussion late, thanks for the effort.

I think the base idea is great, however I have a couple of comments:
- I want to iterate on Timo's comments regarding the confusion between "Dynamic Table" and current Flink "Table". Should the refactoring of the system happen in 2.0, should we rename it in this Flip (as the suggestions in the thread) and address the holistic changes in a separate Flip for 2.0?
- I feel confused with how it is further with other components, the examples provided feel like a standalone ETL job, could you provide in the FLIP an example where the table is further used in subsequent queries (specially in batch mode).
- I really like the standard of keeping the unified batch and streaming approach

Best Regards
Ahmed Hamdy


On Fri, 22 Mar 2024 at 12:07, Lincoln Lee <lincoln.8...@gmail.com> wrote:

Hi Timo,

Thanks for your thoughtful inputs!

Yes, expanding the MATERIALIZED VIEW (MV) could achieve the same function, but our primary concern is that by using a view, we might limit future opportunities to optimize queries through automatic materialization rewriting [1], leveraging the support for MV by physical storage. This is because we would be breaking the intuitive semantics of a materialized view (a materialized view represents the result of a query) by allowing data modifications, thus losing the potential for such optimizations.

With these considerations in mind, we were inspired by Google Looker's Persistent Derived Table [2]. PDT is designed for building Looker's automated modeling, aligning with our purpose for the stream-batch automatic pipeline.
Therefore, we are considering another candidate, Derived Table: the term 'derive' suggests a query, and 'table' retains modifiability. This approach would not disrupt our current concept of a dynamic table, preserving the future utility of MVs.

Conceptually, a Derived Table is a Dynamic Table + Continuous Query. Introducing a new concept, Derived Table, for this FLIP makes all concepts play together nicely.

What do you think about this?

[1] https://urldefense.com/v3/__https://calcite.apache.org/docs/materialized_views.html__;!!IKRxdwAv5BmarQ!dVYcp9PUyjpBGzkYFxb2sdnmB0E22koc-YLdxY2LidExEHUJKRkyvRbAveqjlYFKWevFvmE1Z-j73_NFf4D5$
[2] https://urldefense.com/v3/__https://cloud.google.com/looker/docs/derived-tables*persistent_derived_tables__;Iw!!IKRxdwAv5BmarQ!dVYcp9PUyjpBGzkYFxb2sdnmB0E22koc-YLdxY2LidExEHUJKRkyvRbAveqjlYFKWevFvmE1Z-j7382-2zI3$

Best,
Lincoln Lee


Timo Walther <twal...@apache.org> wrote on Fri, Mar 22, 2024 at 17:54:

Hi Ron,

thanks for the detailed answer. Sorry for my late reply, we had a conference that kept me busy.

> In the current concept[1], it actually includes: Dynamic Tables & Continuous Query. Dynamic Table is just an abstract logical concept

This explanation makes sense to me.
But the docs also say "A continuous query is evaluated on the dynamic table yielding a new dynamic table.". So even our regular CREATE TABLEs are considered dynamic tables. This can also be seen in the diagram "Dynamic Table -> Continuous Query -> Dynamic Table". Currently, Flink queries can only be executed on Dynamic Tables.

> In essence, a materialized view represents the result of a query.

Isn't that what your proposal does as well?

> the object of the suspend operation is the refresh task of the dynamic table

I understand that Snowflake uses the term [1] to merge their concepts of STREAM, TASK, and TABLE into one piece of concept. But Flink has no concept of a "refresh task". Also, they already introduced MATERIALIZED VIEW.
Flink is in the convenient position that the concept of materialized views is not taken (reserved maybe for exactly this use case?). And the SQL standard concept could be "slightly adapted" to our needs. Looking at other vendors like Postgres[2], they also use `REFRESH` commands, so why not add additional commands such as DELETE or UPDATE? Oracle supports "ON PREBUILT TABLE clause tells the database to use an existing table segment"[3] which comes closer to what we want as well.

> it is not intended to support data modification

This is an argument that I understand. But we as Flink could allow data modifications. This way we are only extending the standard and don't introduce new concepts.

If we can't agree on using the MATERIALIZED VIEW concept, we should fix our syntax in a Flink 2.0 effort.
Making regular tables bounded and dynamic tables unbounded. We would be closer to the SQL standard with this and pave the way for the future. I would actually support this if all concepts play together nicely.

> In the future, we can consider extending the statement set syntax to support the creation of multiple dynamic tables.

It's good that we called the concept STATEMENT SET. This allows us to define CREATE TABLE within. Even if it might look a bit confusing.

Regards,
Timo

[1] https://urldefense.com/v3/__https://docs.snowflake.com/en/user-guide/dynamic-tables-about__;!!IKRxdwAv5BmarQ!dVYcp9PUyjpBGzkYFxb2sdnmB0E22koc-YLdxY2LidExEHUJKRkyvRbAveqjlYFKWevFvmE1Z-j73zexZBXu$
[2] https://urldefense.com/v3/__https://www.postgresql.org/docs/current/sql-creatematerializedview.html__;!!IKRxdwAv5BmarQ!dVYcp9PUyjpBGzkYFxb2sdnmB0E22koc-YLdxY2LidExEHUJKRkyvRbAveqjlYFKWevFvmE1Z-j73zbNhvS7$
[3] https://urldefense.com/v3/__https://oracle-base.com/articles/misc/materialized-views__;!!IKRxdwAv5BmarQ!dVYcp9PUyjpBGzkYFxb2sdnmB0E22koc-YLdxY2LidExEHUJKRkyvRbAveqjlYFKWevFvmE1Z-j739xS1kvD$

On 21.03.24 04:14, Feng Jin wrote:

Hi Ron and Lincoln

Thanks for driving this discussion. I believe it will greatly improve the convenience of managing user real-time pipelines.

I have some questions.

*Regarding Limitations of Dynamic Table:*

> Does not support modifying the select statement after the dynamic table is created.

Although currently we restrict users from modifying the query, I wonder if we can provide a better way to help users rebuild it without affecting downstream OLAP queries.


*Regarding the management of background jobs:*

1. From the documentation, the definitions SQL and job information are stored in the Catalog. Does this mean that if a system needs to adapt to Dynamic Tables, it also needs to store Flink's job information in the corresponding system? For example, does MySQL's Catalog need to store flink job information as well?

2. Users still need to consider how much memory is being used, how large the concurrency is, which type of state backend is being used, and may need to set TTL expiration.


*Regarding the Refresh Part:*

> If the refresh mode is continuous and a background job is running, caution should be taken with the refresh command as it can lead to inconsistent data.

When we submit a refresh command, can we help users detect if there are any running jobs and automatically stop them before executing the refresh command? Then wait for it to complete before restarting the background streaming job?

Best,
Feng

On Tue, Mar 19, 2024 at 9:40 PM Lincoln Lee <lincoln.8...@gmail.com> wrote:

Hi Yun,

Thank you very much for your valuable input!

Incremental mode is indeed an attractive idea, we have also discussed this, but in the current design, we first provided two refresh modes: CONTINUOUS and FULL. Incremental mode can be introduced once the execution layer has the capability.

My answer for the two questions:

1.
Yes, cascading is a good question. Current proposal provides a freshness that defines a dynamic table relative to the base table’s lag.
If users need to consider the end-to-end freshness of multiple cascaded dynamic tables, they can manually split them for now. Of course, how to let multiple cascaded or dependent dynamic tables complete the freshness definition in a simpler way, I think it can be extended in the future.

2.
Cascading refresh is also a part we focus on discussing. In this flip, we hope to focus as much as possible on the core features (as it already involves a lot of things), so we did not directly introduce related syntax. However, based on the current design, combined with the catalog and lineage, theoretically, users can also finish the cascading refresh.


Best,
Lincoln Lee


Yun Tang <myas...@live.com> wrote on Tue, Mar 19, 2024 at 13:45:

Hi Lincoln,

Thanks for driving this discussion, and I am so excited to see this topic being discussed in the Flink community!

From my point of view, instead of the work of unifying streaming and batch in DataStream API [1], this FLIP actually could make users benefit from one engine to rule batch & streaming.

If we treat this FLIP as an open-source implementation of Snowflake's dynamic tables [2], we still lack an incremental refresh mode to make the ETL near real-time with a much cheaper computation cost.
However, I think this could be done under the current design by introducing another refresh mode in the future. Although the extra work of incremental view maintenance would be much larger.

For the FLIP itself, I have several questions below:

1. It seems this FLIP does not consider the lag of refreshes across ETL layers from ODS ---> DWD ---> APP [3]. We currently only consider the scheduler interval, which means we cannot use lag to automatically schedule the upfront micro-batch jobs to do the work.
2. To support the automagical refreshes, we should consider the lineage in the catalog or somewhere else.


[1] https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-134*3A*Batch*execution*for*the*DataStream*API__;JSsrKysrKw!!IKRxdwAv5BmarQ!dVYcp9PUyjpBGzkYFxb2sdnmB0E22koc-YLdxY2LidExEHUJKRkyvRbAveqjlYFKWevFvmE1Z-j7352JICzI$
[2] https://urldefense.com/v3/__https://docs.snowflake.com/en/user-guide/dynamic-tables-about__;!!IKRxdwAv5BmarQ!dVYcp9PUyjpBGzkYFxb2sdnmB0E22koc-YLdxY2LidExEHUJKRkyvRbAveqjlYFKWevFvmE1Z-j73zexZBXu$
[3] https://urldefense.com/v3/__https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh__;!!IKRxdwAv5BmarQ!dVYcp9PUyjpBGzkYFxb2sdnmB0E22koc-YLdxY2LidExEHUJKRkyvRbAveqjlYFKWevFvmE1Z-j735ghqpxk$

Best
Yun Tang

________________________________
From: Lincoln Lee <lincoln.8...@gmail.com>
Sent: Thursday, March 14, 2024 14:35
To: dev@flink.apache.org <dev@flink.apache.org>
Subject: Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

Hi Jing,

Thanks for your attention to this flip! I'll try to answer the following questions.

> 1. How to define query of dynamic table?
> Use flink sql or introducing new syntax?
> If use flink sql, how to handle the difference in SQL between streaming and batch processing?
> For example, a query including window aggregate based on processing time?
> or a query including global order by?

Similar to `CREATE TABLE AS query`, here the `query` also uses Flink sql and doesn't introduce a totally new syntax.
We will not change the status with respect to the difference in functionality of flink sql itself on streaming and batch. For example, the proctime window agg on streaming and global sort on batch that you mentioned, in fact, do not work properly in the other mode, so when the user modifies the refresh mode of a dynamic table to one that is not supported, we will throw an exception.

> 2. Whether modify the query of dynamic table is allowed?
> Or we could only refresh a dynamic table based on the initial query?
>
> Yes, in the current design, the query definition of the dynamic table
> is not allowed to be modified, and you can only refresh the data based
> on the initial definition.
>
> > 3. How to use dynamic table?
> > The dynamic table seems to be similar to the materialized view. Will
> > we do something like materialized view rewriting during the
> > optimization?
>
> It's true that dynamic table and materialized view are similar in some
> ways, but as Ron explains there are differences. In terms of
> optimization, automated materialization discovery similar to that
> supported by calcite is also a potential possibility, perhaps with the
> addition of automated rewriting in the future.
>
> Best,
> Lincoln Lee
>
> Ron liu <ron9....@gmail.com> wrote on Thu, Mar 14, 2024 at 14:01:
>
> > Hi, Timo
> >
> > Sorry for the late response, and thanks for your feedback.
> > Regarding your questions:
> >
> > > Flink has introduced the concept of Dynamic Tables many years ago.
> > > How does the term "Dynamic Table" fit into Flink's regular tables
> > > and also how does it relate to Table API?
> > >
> > > I fear that adding the DYNAMIC TABLE keyword could cause confusion
> > > for users, because a term for regular CREATE TABLE (that can be
> > > "kind of dynamic" as well and is backed by a changelog) is then
> > > missing. Also given that we call our connectors for those tables,
> > > DynamicTableSource and DynamicTableSink.
> > >
> > > In general, I find it contradicting that a TABLE can be "paused" or
> > > "resumed". From an English language perspective, this does sound
> > > incorrect. In my opinion (without much research yet), a continuous
> > > updating trigger should rather be modelled as a CREATE MATERIALIZED
> > > VIEW (which users are familiar with?) or a new concept such as a
> > > CREATE TASK (that can be paused and resumed?).
> >
> > 1.
> > In the current concept[1], it actually includes: Dynamic Tables &
> > Continuous Query. Dynamic Table is just an abstract logical concept,
> > which in its physical form represents either a table or a changelog
> > stream. It requires the combination with Continuous Query to achieve
> > dynamic updates of the target table similar to a database’s
> > Materialized View.
> > We hope to upgrade the Dynamic Table to a real entity that users can
> > operate, which combines the logical concepts of Dynamic Tables +
> > Continuous Query.
> > By integrating the definition of tables and queries, it can achieve
> > functions similar to Materialized Views, simplifying users' data
> > processing pipelines.
> > So, the object of the suspend operation is the refresh task of the
> > dynamic table. The command `ALTER DYNAMIC TABLE table_name SUSPEND`
> > is actually a shorthand for `ALTER DYNAMIC TABLE table_name SUSPEND
> > REFRESH` (we can also change it to the full form if that is clearer).
> >
> > 2. Initially, we also considered Materialized Views, but ultimately
> > decided against them. Materialized views are designed to enhance
> > query performance for workloads that consist of common, repetitive
> > query patterns. In essence, a materialized view represents the
> > result of a query.
> > However, it is not intended to support data modification.
> > For Lakehouse scenarios, where the ability to delete or update data
> > is crucial (such as compliance with GDPR, FLIP-2), materialized views
> > fall short.
> >
> > 3.
> > Compared to CREATE (regular) TABLE, CREATE DYNAMIC TABLE not only
> > defines metadata in the catalog but also automatically initiates a
> > data refresh task based on the query specified during table creation.
> > It dynamically executes data updates. Users can focus on data
> > dependencies and data generation logic.
> >
> > 4.
> > The new dynamic table does not conflict with the existing
> > DynamicTableSource and DynamicTableSink interfaces. For the
> > developer, all that needs to be implemented is the new
> > CatalogDynamicTable, without changing the implementation of source
> > and sink.
> >
> > 5.
> > For now, the FLIP does not consider supporting Table API operations
> > on Dynamic Table. However, once the SQL syntax is finalized, we can
> > discuss this in a separate FLIP. Currently, I have a rough idea: the
> > Table API should also introduce DynamicTable operation interfaces
> > corresponding to the existing Table interfaces.
> > The TableEnvironment will provide relevant methods to support
> > various dynamic table operations. The goal for the new Dynamic Table
> > is to offer users an experience similar to using a database, which is
> > why we prioritize SQL-based approaches initially.
> >
> > > How do you envision re-adding the functionality of a statement
> > > set, that fans out to multiple tables?
> > > This is a very important use case for data pipelines.
> >
> > Multi-tables is indeed a very important user scenario. In the
> > future, we can consider extending the statement set syntax to support
> > the creation of multiple dynamic tables.
> >
> > > Since the early days of Flink SQL, we were discussing `SELECT
> > > STREAM * FROM T EMIT 5 MINUTES`. Your proposal seems to rephrase
> > > STREAM and EMIT, into other keywords DYNAMIC TABLE and FRESHNESS.
> > > But the core functionality is still there. I'm wondering if we
> > > should widen the scope (maybe not part of this FLIP but a new FLIP)
> > > to follow the standard more closely.
> > > Making `SELECT * FROM t` bounded by default and use new syntax
> > > for the dynamic behavior. Flink 2.0 would be the perfect time for
> > > this, however, it would require careful discussions. What do you
> > > think?
> >
> > The query part indeed requires a separate FLIP for discussion, as it
> > involves changes to the default behavior.
> >
> > [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/dynamic_tables
> >
> > Best,
> > Ron
> >
> > Jing Zhang <beyond1...@gmail.com> wrote on Wed, Mar 13, 2024 at 15:19:
> >
> > > Hi, Lincoln & Ron,
> > >
> > > Thanks for the proposal.
> > >
> > > I agree with the question raised by Timo.
> > >
> > > Besides, I have some other questions.
> > > 1. How to define query of dynamic table?
> > > Use flink sql or introducing new syntax?
> > > If use flink sql, how to handle the difference in SQL between
> > > streaming and batch processing?
> > > For example, a query including window aggregate based on
> > > processing time? or a query including global order by?
> > >
> > > 2. Whether modify the query of dynamic table is allowed?
> > > Or we could only refresh a dynamic table based on initial query?
> > >
> > > 3. How to use dynamic table?
> > > The dynamic table seems to be similar with materialized view. Will
> > > we do something like materialized view rewriting during the
> > > optimization?
> > >
> > > Best,
> > > Jing Zhang
> > >
> > > Timo Walther <twal...@apache.org> wrote on Wed, Mar 13, 2024 at 01:24:
> > >
> > > > Hi Lincoln & Ron,
> > > >
> > > > thanks for proposing this FLIP. I think a design similar to what
> > > > you propose has been in the heads of many people, however, I'm
> > > > wondering how this will fit into the bigger picture.
> > > >
> > > > I haven't deeply reviewed the FLIP yet, but would like to ask
> > > > some initial questions:
> > > >
> > > > Flink has introduced the concept of Dynamic Tables many years ago.
> > > > How does the term "Dynamic Table" fit into Flink's regular
> > > > tables and also how does it relate to Table API?
> > > >
> > > > I fear that adding the DYNAMIC TABLE keyword could cause
> > > > confusion for users, because a term for regular CREATE TABLE
> > > > (that can be "kind of dynamic" as well and is backed by a
> > > > changelog) is then missing. Also given that we call our
> > > > connectors for those tables, DynamicTableSource and
> > > > DynamicTableSink.
> > > >
> > > > In general, I find it contradicting that a TABLE can be "paused"
> > > > or "resumed". From an English language perspective, this does
> > > > sound incorrect.
> > > > In my opinion (without much research yet), a continuous updating
> > > > trigger should rather be modelled as a CREATE MATERIALIZED VIEW
> > > > (which users are familiar with?) or a new concept such as a
> > > > CREATE TASK (that can be paused and resumed?).
> > > >
> > > > How do you envision re-adding the functionality of a statement
> > > > set, that fans out to multiple tables? This is a very important
> > > > use case for data pipelines.
> > > >
> > > > Since the early days of Flink SQL, we were discussing `SELECT
> > > > STREAM * FROM T EMIT 5 MINUTES`. Your proposal seems to rephrase
> > > > STREAM and EMIT, into other keywords DYNAMIC TABLE and FRESHNESS.
> > > > But the core functionality is still there.
> > > > I'm wondering if we should widen the scope (maybe not part of
> > > > this FLIP but a new FLIP) to follow the standard more closely.
> > > > Making `SELECT * FROM t` bounded by default and use new syntax
> > > > for the dynamic behavior. Flink 2.0 would be the perfect time
> > > > for this, however, it would require careful discussions. What do
> > > > you think?
> > > >
> > > > Regards,
> > > > Timo
> > > >
> > > > On 11.03.24 08:23, Ron liu wrote:
> > > > > Hi, Dev
> > > > >
> > > > > Lincoln Lee and I would like to start a discussion about
> > > > > FLIP-435: Introduce a New Dynamic Table for Simplifying Data
> > > > > Pipelines.
> > > > >
> > > > > This FLIP is designed to simplify the development of data
> > > > > processing pipelines. With Dynamic Tables with uniform SQL
> > > > > statements and freshness, users can define batch and streaming
> > > > > transformations to data in the same way, accelerate ETL
> > > > > pipeline development, and manage task scheduling automatically.
> > > > >
> > > > > For more details, see FLIP-435 [1]. Looking forward to your
> > > > > feedback.
> > > > >
> > > > > [1]
> > > > >
> > > > > Best,
> > > > >
> > > > > Lincoln & Ron