Thanks Ron and Timo for your proposal! Here is my ranking:

1. Derived Table -> extends the persistent semantics of the derived table in the SQL standard, has a strong association with a query, and has industry precedents such as Google Looker.
2. Live Table -> an alternative to 'dynamic table'
3. Materialized Table -> a combination of Materialized View and Table, but still a table that accepts data changes
4. Materialized View -> needs to extend the understanding of a view to accept data changes

The reason for not adding 'Refresh Table' is that I don't want to tell the user to 'refresh a refresh table'.

Best,
Lincoln Lee

Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 20:11:

> Hi, Dev
>
> My rankings are:
>
> 1. Derived Table
> 2. Materialized Table
> 3. Live Table
> 4. Materialized View
>
> Best,
> Ron
>
> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 20:07:
>
>> Hi, Dev
>>
>> After several rounds of discussion, there is currently no consensus on the name of the new concept. Timo has proposed that we decide the name through a vote. This is a good solution when there is no clear preference, so we will adopt this approach.
>>
>> Regarding the name of the new concept, there are currently five candidates:
>> 1. Derived Table -> taken by SQL standard
>> 2. Materialized Table -> similar to SQL materialized view but a table
>> 3. Live Table -> similar to dynamic tables
>> 4. Refresh Table -> states what it does
>> 5. Materialized View -> needs to extend the standard to support modifying data
>>
>> For the above five candidates, everyone can give their rankings based on their preferences. You can rank all five options or only some of them.
>> We will use a scoring rule where the *first rank gets 5 points, second rank gets 4 points, third rank gets 3 points, fourth rank gets 2 points, and fifth rank gets 1 point*.
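The scoring rule above can be sketched in a few lines of Python. This is only an illustration: it tallies the three rankings that are quoted in this thread (Lincoln's, Ron's, and Timo's), so it is a partial count and not the outcome of the vote. The names `POINTS_BY_RANK` and `tally` are made up for this example.

```python
from collections import Counter

# Points for rank 1..5, as stated in the voting rules.
POINTS_BY_RANK = [5, 4, 3, 2, 1]


def tally(ballots):
    """Sum the points per candidate; a ballot may rank fewer than five names."""
    scores = Counter()
    for ranking in ballots:
        if len(ranking) > len(POINTS_BY_RANK):
            raise ValueError("a ballot can rank at most five candidates")
        if len(set(ranking)) != len(ranking):
            raise ValueError("a candidate appears twice on one ballot")
        for points, name in zip(POINTS_BY_RANK, ranking):
            scores[name] += points
    return scores


# Only the rankings quoted in this thread; other votes are not included.
ballots = [
    ["Derived Table", "Live Table", "Materialized Table", "Materialized View"],  # Lincoln
    ["Derived Table", "Materialized Table", "Live Table", "Materialized View"],  # Ron
    ["Refresh Table", "Materialized Table", "Live Table", "Materialized View",
     "Derived Table"],                                                           # Timo
]

scores = tally(ballots)
for name, points in scores.most_common():
    print(f"{name}: {points}")
```

With just these three ballots, two names end up with the same score, which shows that a rule like this may also need a tie-break.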
>> After the voting closes, I will score all the candidates based on everyone's votes, and the candidate with the highest score will be chosen as the name for the new concept.
>>
>> The voting will last up to 72 hours and is expected to close this Friday. I look forward to everyone voting on the name in this thread. Of course, we also welcome new input regarding the name.
>>
>> Best,
>> Ron
>>
>> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 19:49:
>>
>>> Hi, Dev
>>>
>>> Sorry, my previous statement was not quite accurate. We will hold the vote for the name within this thread.
>>>
>>> Best,
>>> Ron
>>>
>>> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 9, 2024 at 19:29:
>>>
>>>> Hi, Timo
>>>>
>>>> Thanks for your reply.
>>>>
>>>> I agree with you that naming is sometimes the harder part. When no one has a clear preference, voting on the name is a good solution, so I'll send a separate email for the vote, clarify the rules, and then let everyone vote.
>>>>
>>>> One other point to confirm: in your ranking there is an option for Materialized View. Does it stand for the UPDATING Materialized View that you mentioned earlier in the discussion? If we use Materialized View, I think it needs to be extended.
>>>>
>>>> Best,
>>>> Ron
>>>>
>>>> Timo Walther <twal...@apache.org> wrote on Tue, Apr 9, 2024 at 17:20:
>>>>
>>>>> Hi Ron,
>>>>>
>>>>> yes, naming is hard. But it will have a large impact on trainings, presentations, and the mental model of users. Maybe the easiest is to collect rankings from everyone with some short justification:
>>>>>
>>>>> My ranking (from good to not so good):
>>>>>
>>>>> 1. Refresh Table -> states what it does
>>>>> 2. Materialized Table -> similar to SQL materialized view but a table
>>>>> 3. Live Table -> nice buzzword, but maybe still too close to dynamic tables?
>>>>> 4. Materialized View -> a bit broader than standard but still very similar
>>>>> 5. Derived table -> taken by standard
>>>>>
>>>>> Regards,
>>>>> Timo
>>>>>
>>>>> On 07.04.24 11:34, Ron liu wrote:
>>>>>> Hi, Dev
>>>>>>
>>>>>> This is a summary letter. After several rounds of discussion, there is a strong consensus about the FLIP proposal and the issues it aims to address. The current point of disagreement is the naming of the new concept. I have summarized the candidates as follows:
>>>>>>
>>>>>> 1. Derived Table (inspired by Google Looker)
>>>>>>    - Pros: Google Looker has introduced this concept, which is designed for building Looker's automated modeling, aligning with our purpose for the stream-batch automatic pipeline.
>>>>>>    - Cons: The SQL standard uses the term "derived table" extensively, and vendors adopt it simply to refer to a table within a subclause.
>>>>>>
>>>>>> 2. Materialized Table: It means materializing the query result into a table, similar to Db2 MQT (Materialized Query Tables). In addition, Snowflake Dynamic Table's predecessor is also called Materialized Table.
>>>>>>
>>>>>> 3. Updating Table (from Timo)
>>>>>>
>>>>>> 4. Updating Materialized View (from Timo)
>>>>>>
>>>>>> 5. Refresh/Live Table (from Martijn)
>>>>>>
>>>>>> As Martijn said, naming is a headache. Looking forward to more valuable input from everyone.
>>>>>>
>>>>>> [1] https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables
>>>>>> [2] https://www.ibm.com/docs/en/db2/11.5?topic=tables-materialized-query
>>>>>> [3] https://community.denodo.com/docs/html/browse/6.0/vdp/vql/materialized_tables/creating_materialized_tables/creating_materialized_tables
>>>>>>
>>>>>> Best,
>>>>>> Ron
>>>>>>
>>>>>> Ron liu <ron9....@gmail.com> wrote on Sun, Apr 7, 2024 at 15:55:
>>>>>>
>>>>>>> Hi, Lorenzo
>>>>>>>
>>>>>>> Thank you for your insightful input.
>>>>>>>
>>>>>>> >>> I think the 2 above twisted the materialized view concept to more than just an optimization for accessing pre-computed aggregates/filters.
>>>>>>> >>> I think that concept (at least in my mind) is now more adherent to the semantics of the words themselves ("materialized" and "view") than to its implementations in DBMSs, as just a view on raw data that, hopefully, is constantly updated with fresh results.
>>>>>>> >>> That's why I understand Timo's et al. objections.
>>>>>>>
>>>>>>> Your understanding of Materialized Views is correct. However, in our scenario, an important feature is the support for Update & Delete operations, which the current Materialized Views cannot fulfill. As we discussed with Timo before, if Materialized Views need to support data modifications, it would require an extension with new keywords, such as CREATE xxx (UPDATING) MATERIALIZED VIEW.
>>>>>>>
>>>>>>> >>> Still, I don't understand why we need another type of special table.
>>>>>>> >>> Could you dive deep into the reasons why not simply adding the FRESHNESS parameter to standard tables?
>>>>>>>
>>>>>>> Firstly, I need to emphasize that we cannot achieve the design goal of the FLIP through the CREATE TABLE syntax combined with a FRESHNESS parameter. The proposal of this FLIP is to use Dynamic Table + Continuous Query, and combine it with FRESHNESS to realize a streaming-batch unification. However, CREATE TABLE is merely a metadata operation and cannot automatically start a background refresh job. To achieve the design goal of the FLIP with standard tables, it would require extending the CTAS [1] syntax to introduce the FRESHNESS keyword. We considered this design initially, but it has the following problems:
>>>>>>>
>>>>>>> 1. Whether a table created through CTAS is a standard table or a "special" standard table with an ongoing background refresh job would be distinguished only by the FRESHNESS keyword, which is very obscure for users.
>>>>>>> 2. It intrudes on the semantics of the CTAS syntax. Currently, tables created using CTAS only add table metadata to the Catalog and do not record attributes such as the query. There are also no ongoing background refresh jobs, and the data writing operation happens only once at table creation.
>>>>>>> 3. For the framework, when we perform a certain kind of ALTER TABLE operation, it is unclear how to distinguish the behavior for a table created with FRESHNESS from that for a table created without it, which will also cause confusion.
>>>>>>>
>>>>>>> In terms of the design goal of combining Dynamic Table + Continuous Query, the FLIP proposal cannot be realized by only extending the current standard tables, so a new kind of dynamic table needs to be introduced as a first-level concept.
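The distinction drawn above, between a one-time CTAS write and a table whose query stays part of its definition, can be illustrated with a toy model. This is a sketch under stated assumptions and not Flink code: `Catalog`, `TableEntry`, `create_table_as`, `create_refreshing_table`, and `refresh` are hypothetical names invented for this example, and the "refresh job" is reduced to re-running a stored Python function.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Rows = List[dict]


@dataclass
class TableEntry:
    """Hypothetical catalog entry; not a Flink class."""
    name: str
    rows: Rows = field(default_factory=list)
    query: Optional[Callable[[], Rows]] = None  # recorded only for the new table kind
    freshness_seconds: Optional[int] = None
    has_refresh_job: bool = False


class Catalog:
    """Toy in-memory catalog, an assumption made for illustration only."""

    def __init__(self) -> None:
        self.tables: Dict[str, TableEntry] = {}

    def create_table_as(self, name: str, query: Callable[[], Rows]) -> TableEntry:
        # CTAS as described above: data is written once, only metadata is kept,
        # the query is not recorded and no background job exists.
        entry = TableEntry(name=name, rows=query())
        self.tables[name] = entry
        return entry

    def create_refreshing_table(self, name: str, query: Callable[[], Rows],
                                freshness_seconds: int) -> TableEntry:
        # The new concept: query and freshness belong to the table definition,
        # and a background refresh job is associated with the table.
        entry = TableEntry(name=name, rows=query(), query=query,
                           freshness_seconds=freshness_seconds,
                           has_refresh_job=True)
        self.tables[name] = entry
        return entry

    def refresh(self, name: str) -> None:
        # Stand-in for what a background refresh job would do periodically.
        entry = self.tables[name]
        if entry.query is None:
            raise ValueError(f"{name} has no recorded query to re-run")
        entry.rows = entry.query()


orders = [{"id": 1, "amount": 10}]


def big_orders() -> Rows:
    return [o for o in orders if o["amount"] >= 10]


catalog = Catalog()
snapshot = catalog.create_table_as("snapshot", big_orders)
live = catalog.create_refreshing_table("live", big_orders, freshness_seconds=180)

orders.append({"id": 2, "amount": 25})
catalog.refresh("live")  # "snapshot" cannot be refreshed: its query was never stored
```

In this model the two entries differ only in whether the query and freshness were recorded, which is the kind of hidden difference the three problems above describe.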
>>>>>>>
>>>>>>> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#as-select_statement
>>>>>>>
>>>>>>> Best,
>>>>>>> Ron
>>>>>>>
>>>>>>> <lorenzo.affe...@ververica.com.invalid> wrote on Wed, Apr 3, 2024 at 22:25:
>>>>>>>
>>>>>>>> Hello everybody!
>>>>>>>> Thanks for the FLIP as it looks amazing (and I think the proof is this deep discussion it is provoking :))
>>>>>>>>
>>>>>>>> I have a couple of comments to add to this:
>>>>>>>>
>>>>>>>> Even though I get the reason why you rejected MATERIALIZED VIEW, I still like it a lot, and I would like to provide pointers on how the materialized view concept has twisted in the last years:
>>>>>>>>
>>>>>>>> • Materialize DB (https://materialize.com/)
>>>>>>>> • The famous talk by Martin Kleppmann "turning the database inside out" (https://www.youtube.com/watch?v=fU9hR3kiOK0)
>>>>>>>>
>>>>>>>> I think the 2 above twisted the materialized view concept to more than just an optimization for accessing pre-computed aggregates/filters.
>>>>>>>> I think that concept (at least in my mind) is now more adherent to the semantics of the words themselves ("materialized" and "view") than to its implementations in DBMSs, as just a view on raw data that, hopefully, is constantly updated with fresh results.
>>>>>>>> That's why I understand Timo's et al. objections.
>>>>>>>> Still I understand there is no need to add confusion :)
>>>>>>>>
>>>>>>>> Still, I don't understand why we need another type of special table.
>>>>>>>> Could you dive deep into the reasons why not simply adding the FRESHNESS parameter to standard tables?
>>>>>>>>
>>>>>>>> I would see that as a very seamless implementation with the goal of a unification of batch and streaming.
>>>>>>>> If we stick to a unified world, I think that Flink should just provide 1 type of table that is inherently dynamic.
>>>>>>>> Now, depending on FRESHNESS objectives / connectors used in WITH, that table can be backed by a stream or batch job as you explained in your FLIP.
>>>>>>>>
>>>>>>>> Maybe I am totally missing the point :)
>>>>>>>>
>>>>>>>> Thank you in advance,
>>>>>>>> Lorenzo
>>>>>>>> On Apr 3, 2024 at 15:25 +0200, Martijn Visser <martijnvis...@apache.org>, wrote:
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> Thanks for the proposal. While the FLIP talks extensively on how Snowflake has Dynamic Tables and Databricks has Delta Live Tables, my understanding is that Databricks has CREATE STREAMING TABLE [1] which relates to this proposal.
>>>>>>>>>
>>>>>>>>> I do have concerns about using CREATE DYNAMIC TABLE, specifically about confusing the users who are familiar with Snowflake's approach where you can't change the content via DML statements, while that is something that would work in this proposal. Naming is hard of course, but I would probably prefer something like CREATE CONTINUOUS TABLE, CREATE REFRESH TABLE or CREATE LIVE TABLE.
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>> Martijn
>>>>>>>>>
>>>>>>>>> [1] https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-ddl-create-streaming-table.html
>>>>>>>>>
>>>>>>>>> On Wed, Apr 3, 2024 at 5:19 AM Ron liu <ron9....@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi, dev
>>>>>>>>>>
>>>>>>>>>> After offline discussion with Becket Qin, Lincoln Lee and Jark Wu, we have improved some parts of the FLIP.
>>>>>>>>>>
>>>>>>>>>> 1. Add a Full Refresh Mode section to clarify the semantics of full refresh mode.
>>>>>>>>>> 2. Add a Future Improvement section explaining why the query statement does not support references to temporary views, and possible solutions.
>>>>>>>>>> 3. The Future Improvement section explains a possible future solution for dynamic tables to support the modification of query statements to meet the common field-level schema evolution requirements of the lakehouse.
>>>>>>>>>> 4. The Refresh section emphasizes that the Refresh command and the background refresh job can be executed in parallel, with no restrictions at the framework level.
>>>>>>>>>> 5. Convert RefreshHandler into a plug-in interface to support various workflow schedulers.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Ron
>>>>>>>>>>
>>>>>>>>>> Ron liu <ron9....@gmail.com> wrote on Tue, Apr 2, 2024 at 10:28:
>>>>>>>>>>
>>>>>>>>>>>> Hi, Venkata krishnan
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for your involvement and suggestions. I hope that the design goals of this FLIP will be helpful to your business.
>>>>>>>>>>>>
>>>>>>>>>>>> >>>>>> 1. In the proposed FLIP, given the example for the dynamic table, do the data sources always come from a single lake storage such as Paimon or does the same proposal solve for 2 disparate storage systems like Kafka and Iceberg where Kafka events are ETLed to Iceberg similar to Paimon? Basically the lambda architecture that is mentioned in the FLIP as well.
>>>>>>>>>>>> >>>>>> I'm wondering if it is possible to switch b/w sources based on the execution mode, for eg: if it is backfill operation, switch to a data lake storage system like Iceberg, otherwise an event streaming system like Kafka.
>>>>>>>>>>>>
>>>>>>>>>>>> Dynamic table is a design abstraction at the framework level and is not tied to the physical implementation of the connector. If a connector supports a combination of Kafka and lake storage, this works fine.
>>>>>>>>>>>>
>>>>>>>>>>>> >>>>>> 2. What happens in the context of a bootstrap (batch) + nearline update (streaming) case that are stateful applications? What I mean by that is, will the state from the batch application be transferred to the nearline application after the bootstrap execution is complete?
>>>>>>>>>>>>
>>>>>>>>>>>> I think this is another, orthogonal thing, something that FLIP-327 tries to address, not directly related to Dynamic Table.
>>>>>>>>>>>>
>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-327%3A+Support+switching+from+batch+to+stream+mode+to+improve+throughput+when+processing+backlog+data
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Ron
>>>>>>>>>>>>
>>>>>>>>>>>> Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote on Sat, Mar 30, 2024 at 07:06:
>>>>>>>>>>>>
>>>>>>>>>>>>>> Ron and Lincoln,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Great proposal and interesting discussion for adding support for dynamic tables within Flink.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> At LinkedIn, we are also trying to solve compute/storage convergence for similar problems discussed as part of this FLIP, specifically periodic backfill, bootstrap + nearline update use cases using single implementation of business logic (single script).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Few clarifying questions:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. In the proposed FLIP, given the example for the dynamic table, do the data sources always come from a single lake storage such as Paimon or does the same proposal solve for 2 disparate storage systems like Kafka and Iceberg where Kafka events are ETLed to Iceberg similar to Paimon? Basically the lambda architecture that is mentioned in the FLIP as well. I'm wondering if it is possible to switch b/w sources based on the execution mode, for eg: if it is backfill operation, switch to a data lake storage system like Iceberg, otherwise an event streaming system like Kafka.
>>>>>>>>>>>>>> 2. What happens in the context of a bootstrap (batch) + nearline update (streaming) case that are stateful applications? What I mean by that is, will the state from the batch application be transferred to the nearline application after the bootstrap execution is complete?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Venkata krishnan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Mar 25, 2024 at 8:03 PM Ron liu <ron9....@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi, Timo
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for your quick response, and your suggestion.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, this discussion has turned into confirming whether it's a special table or a special MV.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. The key problem with MVs is that they don't support modification, so I prefer it to be a special table. Although the periodic refresh behavior is more characteristic of an MV, since it is already a special table, supporting periodic refresh behavior is quite natural, similar to Snowflake dynamic tables.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. Regarding the keyword UPDATING, since the current Regular Table is a Dynamic Table, which implies support for updating through Continuous Query, I think it is redundant to add the keyword UPDATING. In addition, UPDATING cannot reflect the Continuous Query part and cannot express our purpose of simplifying the data pipeline through Dynamic Table + Continuous Query.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3. From the perspective of the SQL standard definition, I can understand your concerns about Derived Table, but is it possible to make a slight adjustment to meet our needs? Additionally, as Lincoln mentioned, the Google Looker platform has introduced Persistent Derived Table, and there are precedents in the industry; could Derived Table be a candidate?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Of course, I look forward to your better suggestions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Ron
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Timo Walther <twal...@apache.org> wrote on Mon, Mar 25, 2024 at 18:49:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> After thinking about this more, this discussion boils down to whether this is a special table or a special materialized view. In both cases, we would need to add a special keyword:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Either
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> CREATE UPDATING TABLE
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> CREATE UPDATING MATERIALIZED VIEW
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I still feel that the periodic refreshing behavior is closer to an MV. If we add a special keyword to MV, the optimizer would know that the data cannot be used for query optimizations.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I will ask more people for their opinion.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On 25.03.24 10:45, Timo Walther wrote:
>>>>>>>>>>>>>>>>>>>> Hi Ron and Lincoln,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> thanks for the quick response and the very insightful discussion.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> we might limit future opportunities to optimize queries
>>>>>>>>>>>>>>>>>>>> >> through automatic materialization rewriting by allowing data
>>>>>>>>>>>>>>>>>>>> >> modifications, thus losing the potential for such optimizations.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This argument makes a lot of sense to me. Due to the updates, the system is not in full control of the persisted data. However, the system is still in full control of the job that powers the refresh. So if the system manages all updating pipelines, it could still leverage automatic materialization rewriting but without leveraging the data at rest (only the data in flight).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> we are considering another candidate, Derived Table, the term 'derive'
>>>>>>>>>>>>>>>>>>>> >> suggests a query, and 'table' retains modifiability. This approach
>>>>>>>>>>>>>>>>>>>> >> would not disrupt our current concept of a dynamic table
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I did some research on this term. The SQL standard uses the term "derived table" extensively (defined in section 4.17.3). Thus, a lot of vendors adopt this for simply referring to a table within a subclause:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> https://dev.mysql.com/doc/refman/8.0/en/derived-tables.html
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> https://infocenter.sybase.com/help/topic/com.sybase.infocenter.dc32300.1600/doc/html/san1390612291252.html
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> https://www.c-sharpcorner.com/article/derived-tables-vs-common-table-expressions/
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> https://stackoverflow.com/questions/26529804/what-are-the-derived-tables-in-my-explain-statement
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> https://www.sqlservercentral.com/articles/sql-derived-tables
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Esp. the latter example is interesting, SQL Server allows things like this on derived tables:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> UPDATE T SET Name='Timo' FROM (SELECT * FROM Product) AS T
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> SELECT * FROM Product;
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Btw also Snowflake's dynamic table state:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> >> Because the content of a dynamic table is fully determined
>>>>>>>>>>>>>>>>>>>> >> by the given query, the content cannot be changed by using DML.
>>>>>>>>>>>>>>>>>>>> >> You don’t insert, update, or delete the rows in a dynamic table.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> So a new term makes a lot of sense.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> How about using `UPDATING`?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> CREATE UPDATING TABLE
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> This reflects that modifications can be made and from an English-language perspective you can PAUSE or RESUME the UPDATING. Thus, a user can define UPDATING interval and mode?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Looking forward to your thoughts.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>>>>>> Timo
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On 25.03.24 07:09, Ron liu wrote:
>>>>>>>>>>>>>>>>>>>>>> Hi, Ahmed
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks for your feedback.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Regarding your question:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> I want to iterate on Timo's comments regarding the confusion between "Dynamic Table" and current Flink "Table".
>>>>>>>>>>>>>>>>>>>>>> >> Should the refactoring of the system happen in 2.0, should we rename it in this FLIP (as the suggestions in the thread) and address the holistic changes in a separate FLIP for 2.0?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Lincoln proposed a new concept in reply to Timo: Derived Table, which is a combination of Dynamic Table + Continuous Query. The use of Derived Table will not conflict with existing concepts. What do you think?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> >> I feel confused with how it is further with other components, the examples provided feel like a standalone ETL job, could you provide in the FLIP an example where the table is further used in subsequent queries (specially in batch mode).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks for your suggestion. I added how to use Dynamic Table to the user story section of the FLIP: a Dynamic Table can be referenced by a downstream Dynamic Table and can also support OLAP queries.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>> Ron
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Ron liu <ron9....@gmail.com> wrote on Sat, Mar 23, 2024 at 10:35:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Hi, Feng
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks for your feedback.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> Although currently we restrict users from modifying the query, I wonder if we can provide a better way to help users rebuild it without affecting downstream OLAP queries.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Considering the problem of data consistency, in the first step we strictly limit the semantics and do not support modifying the query. This is a really good question. One of my ideas is to introduce a syntax similar to SWAP [1], which supports exchanging two Dynamic Tables.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> From the documentation, the definitions SQL and job information are stored in the Catalog. Does this mean that if a system needs to adapt to Dynamic Tables, it also needs to store Flink's job information in the corresponding system?
>>>>>>>>>>>>>>>>>>>>>>>> >> For example, does MySQL's Catalog need to store flink job information as well?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Yes, currently we need to rely on the Catalog to store refresh job information.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> Users still need to consider how much memory is being used, how large the concurrency is, which type of state backend is being used, and may need to set TTL expiration.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Similar to the current practice, job parameters can be set via the Flink conf or SET commands.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> >> When we submit a refresh command, can we help users detect if there are any running jobs and automatically stop them before executing the refresh command? Then wait for it to complete before restarting the background streaming job?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Purely from a technical implementation point of view, your proposal is doable, but it would be more costly. Also, I think data consistency itself is the responsibility of the user, just as it already is for a Regular Table today, so this is consistent with existing behavior and no additional guarantees are made at the engine level.

Best,
Ron

Ahmed Hamdy <hamdy10...@gmail.com> wrote on Fri, Mar 22, 2024 at 23:50:

Hi Ron,
Sorry for joining the discussion late, thanks for the effort.

I think the base idea is great, however I have a couple of comments:
- I want to iterate on Timo's comments regarding the confusion between "Dynamic Table" and the current Flink "Table". Should the refactoring of the system happen in 2.0, should we rename it in this FLIP (as suggested in the thread) and address the holistic changes in a separate FLIP for 2.0?
- I am confused about how it is used further with other components; the examples provided feel like a standalone ETL job. Could you provide in the FLIP an example where the table is used in subsequent queries (especially in batch mode)?
- I really like the standard of keeping the unified batch and streaming approach.

Best Regards
Ahmed Hamdy

On Fri, 22 Mar 2024 at 12:07, Lincoln Lee <lincoln.8...@gmail.com> wrote:

Hi Timo,

Thanks for your thoughtful inputs!

Yes, expanding the MATERIALIZED VIEW (MV) could achieve the same function, but our primary concern is that by using a view, we might limit future opportunities to optimize queries through automatic materialization rewriting [1], leveraging the support for MV by physical storage. This is because we would be breaking the intuitive semantics of a materialized view (a materialized view represents the result of a query) by allowing data modifications, thus losing the potential for such optimizations.

With these considerations in mind, we were inspired by Google Looker's Persistent Derived Table [2]. PDT is designed for building Looker's automated modeling, aligning with our purpose for the stream-batch automatic pipeline. Therefore, we are considering another candidate, Derived Table: the term 'derive' suggests a query, and 'table' retains modifiability. This approach would not disrupt our current concept of a dynamic table, preserving the future utility of MVs.

Conceptually, a Derived Table is a Dynamic Table + Continuous Query. Introducing a new concept, Derived Table, for this FLIP makes all concepts play together nicely.

What do you think about this?

[1] https://calcite.apache.org/docs/materialized_views.html
[2] https://cloud.google.com/looker/docs/derived-tables#persistent_derived_tables

Best,
Lincoln Lee

Timo Walther <twal...@apache.org> wrote on Fri, Mar 22, 2024 at 17:54:

Hi Ron,

thanks for the detailed answer. Sorry for my late reply, we had a conference that kept me busy.

> In the current concept[1], it actually includes: Dynamic Tables & Continuous Query. Dynamic Table is just an abstract logical concept

This explanation makes sense to me.
But the docs also say "A continuous query is evaluated on the dynamic table yielding a new dynamic table.". So even our regular CREATE TABLEs are considered dynamic tables. This can also be seen in the diagram "Dynamic Table -> Continuous Query -> Dynamic Table". Currently, Flink queries can only be executed on Dynamic Tables.

> In essence, a materialized view represents the result of a query.

Isn't that what your proposal does as well?

> the object of the suspend operation is the refresh task of the dynamic table

I understand that Snowflake uses the term [1] to merge their concepts of STREAM, TASK, and TABLE into one piece of concept. But Flink has no concept of a "refresh task". Also, they already introduced MATERIALIZED VIEW.
Flink is in the convenient position that the concept of materialized views is not taken (reserved maybe for exactly this use case?). And the SQL standard concept could be "slightly adapted" to our needs. Looking at other vendors like Postgres[2], they also use `REFRESH` commands, so why not add additional commands such as DELETE or UPDATE? Oracle supports "ON PREBUILT TABLE clause tells the database to use an existing table segment"[3], which comes closer to what we want as well.

> it is not intended to support data modification

This is an argument that I understand. But we as Flink could allow data modifications. This way we are only extending the standard and don't introduce new concepts.

If we can't agree on using the MATERIALIZED VIEW concept, we should fix our syntax in a Flink 2.0 effort, making regular tables bounded and dynamic tables unbounded.
We would be closer to the SQL standard with this and pave the way for the future. I would actually support this if all concepts play together nicely.

> In the future, we can consider extending the statement set syntax to support the creation of multiple dynamic tables.

It's good that we called the concept STATEMENT SET. This allows us to define CREATE TABLE within, even if it might look a bit confusing.

Regards,
Timo

[1] https://docs.snowflake.com/en/user-guide/dynamic-tables-about
[2] https://www.postgresql.org/docs/current/sql-creatematerializedview.html
[3]
https://oracle-base.com/articles/misc/materialized-views

On 21.03.24 04:14, Feng Jin wrote:

Hi Ron and Lincoln

Thanks for driving this discussion. I believe it will greatly improve the convenience of managing user real-time pipelines.

I have some questions.

*Regarding Limitations of Dynamic Table:*

> Does not support modifying the select statement after the dynamic table is created.

Although currently we restrict users from modifying the query, I wonder if we can provide a better way to help users rebuild it without affecting downstream OLAP queries.

*Regarding the management of background jobs:*

1. From the documentation, the definition SQL and job information are stored in the Catalog. Does this mean that if a system needs to adapt to Dynamic Tables, it also needs to store Flink's job information in the corresponding system? For example, does MySQL's Catalog need to store Flink job information as well?

2. Users still need to consider how much memory is being used, how large the concurrency is, which type of state backend is being used, and may need to set TTL expiration.

*Regarding the Refresh Part:*

> If the refresh mode is continuous and a background job is running, caution should be taken with the refresh command as it can lead to inconsistent data.

When we submit a refresh command, can we help users detect if there are any running jobs and automatically stop them before executing the refresh command? Then wait for it to complete before restarting the background streaming job?

Best,
Feng

On Tue, Mar 19, 2024 at 9:40 PM Lincoln Lee <lincoln.8...@gmail.com> wrote:

Hi Yun,

Thank you very much for your valuable input!

Incremental mode is indeed an attractive idea. We have also discussed this, but in the current design we first provide two refresh modes: CONTINUOUS and FULL. Incremental mode can be introduced once the execution layer has the capability.

My answers to the two questions:

1.
Yes, cascading is a good question. The current proposal provides a freshness that defines a dynamic table's lag relative to the base table. If users need to consider the end-to-end freshness of multiple cascaded dynamic tables, they can manually split them for now. Of course, how to let multiple cascaded or dependent dynamic tables complete the freshness definition in a simpler way can, I think, be extended in the future.

2.
Cascading refresh is also a part we focused on in our discussions. In this FLIP, we hope to focus as much as possible on the core features (as it already involves a lot of things), so we did not directly introduce related syntax. However, based on the current design, combined with the catalog and lineage, users can theoretically also perform the cascading refresh.
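
As a rough illustration of the manual split mentioned in answer 1 above, here is a minimal sketch. It assumes, purely for illustration, that the lags of cascaded tables simply add up along the chain; the table names and both helper functions are hypothetical and are not part of the FLIP or of any Flink API.

```python
from datetime import timedelta


def end_to_end_lag(freshness, chain):
    """Worst-case lag of the last table in `chain` behind the original source.

    Assumes (for illustration only) that per-table lags simply add up.
    """
    return sum((freshness[name] for name in chain), timedelta())


def split_budget(total, chain):
    """Split an end-to-end freshness budget evenly across the chain."""
    share = total / len(chain)
    return {name: share for name in chain}


# Hypothetical three-layer pipeline: each table is derived from the previous one.
chain = ["ods_orders", "dwd_orders", "app_orders"]
freshness = split_budget(timedelta(minutes=3), chain)
total_lag = end_to_end_lag(freshness, chain)
```

An even split is only the simplest choice; a user could just as well give a heavier layer a larger share, as long as the shares stay within the end-to-end budget.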

Best,
Lincoln Lee

Yun Tang <myas...@live.com> wrote on Tue, Mar 19, 2024 at 13:45:

Hi Lincoln,

Thanks for driving this discussion, and I am so excited to see this topic being discussed in the Flink community!

From my point of view, instead of the work of unifying streaming and batch in DataStream API [1], this FLIP actually could make users benefit from one engine to rule batch & streaming.

If we treat this FLIP as an open-source implementation of Snowflake's dynamic tables [2], we still lack an incremental refresh mode to make the ETL near real-time with a much cheaper computation cost.
However, I think this could be done under the current design by introducing another refresh mode in the future, although the extra work of incremental view maintenance would be much larger.

For the FLIP itself, I have several questions below:

1. It seems this FLIP does not consider the lag of refreshes across ETL layers from ODS ---> DWD ---> APP [3]. We currently only consider the scheduler interval, which means we cannot use lag to automatically schedule the upfront micro-batch jobs to do the work.
2. To support the automagical refreshes, we should consider the lineage in the catalog or somewhere else.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-134%3A+Batch+execution+for+the+DataStream+API
[2] https://docs.snowflake.com/en/user-guide/dynamic-tables-about
[3] https://docs.snowflake.com/en/user-guide/dynamic-tables-refresh

Best
Yun Tang

________________________________
From: Lincoln Lee <lincoln.8...@gmail.com>
Sent: Thursday, March 14, 2024 14:35
To: dev@flink.apache.org <dev@flink.apache.org>
Subject: Re: [DISCUSS] FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines

Hi Jing,

Thanks for your attention to this FLIP! I'll try to answer the following questions.

> 1. How to define query of dynamic table?
> Use flink sql or introducing new syntax?
> If use flink sql, how to handle the difference in SQL between streaming and batch processing?
> For example, a query including window aggregate based on processing time?
> or a query including global order by?

Similar to `CREATE TABLE AS query`, here the `query` also uses Flink SQL and doesn't introduce a totally new syntax.
We will not change the status quo with respect to the differences in functionality of Flink SQL itself between streaming and batch. For example, the proctime window agg on streaming and the global sort on batch that you mentioned do not, in fact, work properly in the other mode, so when the user changes the refresh mode of a dynamic table to one that is not supported, we will throw an exception.

> 2. Whether modify the query of dynamic table is allowed?
> Or we could only refresh a dynamic table based on the initial query?

Yes, in the current design, the query definition of the dynamic table is not allowed to be modified, and you can only refresh the data based on the initial definition.
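
The refresh-mode check described in the answer to question 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: the mode names CONTINUOUS and FULL come from this thread, but mapping them to streaming and batch execution, the feature tags, the exception class, and the `validate_refresh_mode` function are all hypothetical and are not Flink APIs.

```python
from enum import Enum


class RefreshMode(Enum):
    CONTINUOUS = "CONTINUOUS"  # assumed here to run as a streaming job
    FULL = "FULL"              # assumed here to run as a batch job


# Hypothetical feature tags; a real planner would derive these from the query.
UNSUPPORTED_FEATURES = {
    RefreshMode.CONTINUOUS: {"global_order_by"},
    RefreshMode.FULL: {"proctime_window_agg"},
}


class UnsupportedRefreshModeError(ValueError):
    """Raised when a query cannot run in the requested refresh mode."""


def validate_refresh_mode(query_features, target_mode):
    """Reject a refresh-mode change if the query uses features the mode lacks."""
    conflicts = sorted(set(query_features) & UNSUPPORTED_FEATURES[target_mode])
    if conflicts:
        raise UnsupportedRefreshModeError(
            f"{target_mode.value} refresh mode does not support: {', '.join(conflicts)}"
        )
```

With this sketch, `validate_refresh_mode({"proctime_window_agg"}, RefreshMode.FULL)` raises, while the same query validated against `RefreshMode.CONTINUOUS` passes silently.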

> 3. How to use dynamic table?
> The dynamic table seems to be similar to the materialized view. Will we do something like materialized view rewriting during the optimization?

It's true that dynamic table and materialized view are similar in some ways, but as Ron explains, there are differences. In terms of optimization, automated materialization discovery similar to that supported by Calcite is also a potential possibility, perhaps with the addition of automated rewriting in the future.

Best,
Lincoln Lee

Ron liu <ron9....@gmail.com> wrote on Thu, Mar 14, 2024 at 14:01:

Hi, Timo

Sorry for the late response, and thanks for your feedback.
Regarding your questions:

> Flink has introduced the concept of Dynamic Tables many years ago. How does the term "Dynamic Table" fit into Flink's regular tables and also how does it relate to Table API?

> I fear that adding the DYNAMIC TABLE keyword could cause confusion for users, because a term for regular CREATE TABLE (that can be "kind of dynamic" as well and is backed by a changelog) is then missing.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is especially so given that we call our connectors for those tables DynamicTableSource and DynamicTableSink.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In general, I find it contradictory that a TABLE can be "paused" or "resumed". From an English language perspective, this does sound incorrect. In my opinion (without much research yet), a continuous updating trigger should rather be modelled as a CREATE MATERIALIZED VIEW (which users are familiar with?) or a new concept such as a CREATE TASK (that can be paused and resumed?).
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. The current concept [1] actually comprises two parts: Dynamic Tables and Continuous Queries. A Dynamic Table is just an abstract logical concept, which in its physical form represents either a table or a changelog stream.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It must be combined with a Continuous Query to achieve dynamic updates of the target table, similar to a database's Materialized View.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We hope to upgrade the Dynamic Table to a real entity that users can operate, which combines the logical concepts of Dynamic Tables + Continuous Query. By integrating the definition of tables and queries, it can achieve functions similar to Materialized Views, simplifying users' data processing pipelines.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, the object of the suspend operation is the refresh task of the dynamic table. The command `ALTER DYNAMIC TABLE table_name SUSPEND` is actually a shorthand for `ALTER DYNAMIC TABLE table_name SUSPEND REFRESH` (if the full form is clearer, we can change the syntax to it).
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Initially, we also considered Materialized Views, but ultimately decided against them.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Materialized views are designed to enhance query performance for workloads that consist of common, repetitive query patterns. In essence, a materialized view represents the result of a query.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, it is not intended to support data modification. For Lakehouse scenarios, where the ability to delete or update data is crucial (such as compliance with GDPR, FLIP-2), materialized views fall short.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. Compared to CREATE (regular) TABLE, CREATE DYNAMIC TABLE not only defines metadata in the catalog but also automatically initiates a data refresh task based on the query specified during table creation. It dynamically executes data updates. Users can focus on data dependencies and data generation logic.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 4.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The new dynamic table does not conflict with the existing DynamicTableSource and DynamicTableSink interfaces. For the developer, all that needs to be implemented is the new CatalogDynamicTable, without changing the implementation of source and sink.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 5. For now, the FLIP does not consider supporting Table API operations on Dynamic Table. However, once the SQL syntax is finalized, we can discuss this in a separate FLIP. Currently, I have a rough idea: the Table API should also introduce DynamicTable operation interfaces corresponding to the existing Table interfaces. The TableEnvironment will provide relevant methods to support various dynamic table operations.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The goal for the new Dynamic Table is to offer users an experience similar to using a database, which is why we prioritize SQL-based approaches initially.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> How do you envision re-adding the functionality of a statement set that fans out to multiple tables? This is a very important use case for data pipelines.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Fanning out to multiple tables is indeed a very important user scenario. In the future, we can consider extending the statement set syntax to support the creation of multiple dynamic tables.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Since the early days of Flink SQL, we have been discussing `SELECT STREAM * FROM T EMIT 5 MINUTES`.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Your proposal seems to rephrase STREAM and EMIT into other keywords, DYNAMIC TABLE and FRESHNESS. But the core functionality is still there. I'm wondering if we should widen the scope (maybe not part of this FLIP but a new FLIP) to follow the standard more closely: making `SELECT * FROM t` bounded by default and using new syntax for the dynamic behavior. Flink 2.0 would be the perfect time for this; however, it would require careful discussions. What do you think?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The query part indeed requires a separate FLIP for discussion, as it involves changes to the default behavior.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/concepts/dynamic_tables
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Ron
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jing Zhang <beyond1...@gmail.com> wrote on Wed, Mar 13, 2024 at 15:19:
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, Lincoln & Ron,
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the proposal.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with the question raised by Timo.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Besides, I have some other questions.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. How do we define the query of a dynamic table?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Do we use Flink SQL or introduce new syntax?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we use Flink SQL, how do we handle the differences in SQL between streaming and batch processing?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For example, a query that includes a window aggregate based on processing time, or a query that includes a global ORDER BY?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Is modifying the query of a dynamic table allowed?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Or can we only refresh a dynamic table based on its initial query?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. How do we use a dynamic table?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The dynamic table seems to be similar to a materialized view. Will we do something like materialized view rewriting during the optimization?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jing Zhang
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Timo Walther <twal...@apache.org> wrote on Wed, Mar 13, 2024 at 01:24:
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Lincoln & Ron,
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks for proposing this FLIP. I think a design similar to what you propose has been in the heads of many people; however, I'm wondering how this will fit into the bigger picture.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I haven't deeply reviewed the FLIP yet, but would like to ask some initial questions:
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Flink introduced the concept of Dynamic Tables many years ago. How does the term "Dynamic Table" fit into Flink's regular tables, and how does it relate to the Table API?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I fear that adding the DYNAMIC TABLE keyword could cause confusion for users, because a term for regular CREATE TABLE (that can be "kind of dynamic" as well and is backed by a changelog) is then missing. This is especially so given that we call our connectors for those tables DynamicTableSource and DynamicTableSink.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In general, I find it contradictory that a TABLE can be "paused" or "resumed". From an English language perspective, this does sound incorrect. In my opinion (without much research yet), a continuous updating trigger should rather be modelled as a CREATE MATERIALIZED VIEW (which users are familiar with?) or a new concept such as a CREATE TASK (that can be paused and resumed?).
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> How do you envision re-adding the functionality of a statement set that fans out to multiple tables? This is a very important use case for data pipelines.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Since the early days of Flink SQL, we have been discussing `SELECT STREAM * FROM T EMIT 5 MINUTES`. Your proposal seems to rephrase STREAM and EMIT into other keywords, DYNAMIC TABLE and FRESHNESS. But the core functionality is still there. I'm wondering if we should widen the scope (maybe not part of this FLIP but a new FLIP) to follow the standard more closely: making `SELECT * FROM t` bounded by default and using new syntax for the dynamic behavior.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Flink 2.0 would be the perfect time for this; however, it would require careful discussions. What do you think?
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Timo
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 11.03.24 08:23, Ron liu wrote:
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, Dev
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Lincoln Lee and I would like to start a discussion about FLIP-435: Introduce a New Dynamic Table for Simplifying Data Pipelines.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This FLIP is designed to simplify the development of data processing pipelines.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> With Dynamic Tables, which offer uniform SQL statements and freshness, users can define batch and streaming transformations on data in the same way, accelerate ETL pipeline development, and manage task scheduling automatically.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For more details, see FLIP-435 [1]. Looking forward to your feedback.
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Lincoln & Ron