Hi Feng,

The reply looks good to me. I have one question: you mention the `DESC MATERIALIZED TABLE` syntax in the FLIP, but we haven't provided this syntax so far. I think we should add it to this FLIP if needed.

Best,
Ron

Feng Jin <jinfeng1...@gmail.com> wrote on Wed, Dec 18, 2024 at 16:52:

> Hi Ron
>
> Thanks for your reply.
>
> > Is it only possible to add columns at the end and not anywhere in
> > table schema, some databases have this limitation, does lake storage such
> > as Iceberg/Paimon have this limitation?
>
> Currently, we can restrict adding columns only to the end of the schema.
> Although both Paimon and Iceberg already support adding columns anywhere,
> there are still some systems that do not. I will include this in the FLIP.
>
> > In the Refresh Task Behavior section you mention partition hints, is it
> > possible to clarify what it is in the FLIP?
>
> I have added the relevant details.
>
> > Are you able to articulate the default behavior?
>
> The detailed explanation for this part has been updated.
>
> > How users can determine if states are compatible?
>
> Users can only rely on their experience to make modifications. Currently,
> the Flink framework does not guarantee that changes to SQL logic will
> maintain state compatibility.
>
> I think we can add some suggestions in the user documentation in the
> future. While the framework itself cannot ensure state compatibility, some
> simple modification scenarios can indeed be compatible.
>
> For now, the responsibility is left to the users.
>
> Even if recovery ultimately fails, users still have the option to roll
> back to the original query or start consuming from a new offset by
> disabling recovery parameters.
>
> Best,
> Feng
>
> On Tue, Dec 17, 2024 at 10:37 AM Ron Liu <ron9....@gmail.com> wrote:
>
>> Hi Feng
>>
>> Thanks for initiating this FLIP, in lakehouse, Schema Evolution of tables
>> due to modification of business logic is a very common scenario, so
>> Materialized Table's support for modification of Query can greatly improve
>> flexibility and usability, and we've seen that other similar products in
>> the industry also support this capability.
>>
>> I read the content of this FLIP and the overall design looks good, +1.
>> However, I have some questions as follows:
>>
>> 1. By `ALTER MATERIALIZED TABLE ... AS select` statement to realize the
>> add column logic, is it only possible to add columns at the end and not
>> anywhere in table schema, some databases have this limitation, does lake
>> storage such as Iceberg/Paimon have this limitation?
>> 2. In the Refresh Task Behavior section you mention partition hints, is
>> it possible to clarify what it is in the FLIP?
>>
>> > *CONTINUOUS Mode:* Stops the old job and starts a new one with the
>> > updated query.
>> >
>> > - The initial position of the new job is controlled by the source
>> >   parameters.
>> > - For compatible logic changes, recovery parameters
>> >   (execution.state-recovery.path) can be manually set if state
>> >   compatibility is confirmed.
>>
>> 4. Are you able to articulate the default behavior?
>> 5. How users can determine if states are compatible?
>>
>> Best,
>> Ron
>>
>> Feng Jin <jinfeng1...@gmail.com> wrote on Mon, Dec 16, 2024 at 10:49:
>>
>>> Hi, everyone,
>>>
>>> I’d like to initiate a discussion on FLIP-492: Support Query
>>> Modifications for Materialized Tables[1].
>>>
>>> In FLIP-435[2], we introduced *MATERIALIZED TABLES*. By defining query
>>> logic and specifying data freshness requirements, users can efficiently
>>> build data pipelines, greatly improving development productivity.
>>> FLIP-492 builds on this by addressing a common need: the ability to
>>> modify the query logic of an existing MATERIALIZED TABLE. Two approaches
>>> are proposed:
>>>
>>> *1. Modifying the Query Logic: ALTER MATERIALIZED TABLE AS <query>*
>>> Retain historical data while modifying the query logic:
>>>
>>> ```
>>> ALTER MATERIALIZED TABLE [catalog_name.][db_name.]table_name AS <query>
>>> ```
>>>
>>> *2. Replacing the Table: CREATE OR REPLACE MATERIALIZED TABLE*
>>> Reconstruct the table with a new query, discarding all historical data:
>>>
>>> ```
>>> CREATE [OR REPLACE] MATERIALIZED TABLE [catalog_name.][db_name.]table_name
>>> [ ([<table_constraint>]) ]
>>> [COMMENT table_comment]
>>> [PARTITIONED BY (partition_column_name1, partition_column_name2, ...)]
>>> [WITH (key1=val1, key2=val2, ...)]
>>> FRESHNESS = INTERVAL '<num>' { SECOND | MINUTE | HOUR | DAY }
>>> [REFRESH_MODE = { CONTINUOUS | FULL }]
>>> AS <select_statement>
>>> ```
>>>
>>> For a more detailed explanation of this proposal, please refer to the
>>> FLIP-492[1] documentation.
>>> Your feedback and suggestions are highly appreciated to help refine this
>>> proposal further.
>>>
>>> Lastly, I’d like to thank Ron and Lincoln (cc’d) for their valuable
>>> input and suggestions during the drafting process.
>>>
>>> Looking forward to hearing your thoughts!
>>>
>>> Best,
>>> Feng Jin
>>>
>>> [1]. https://cwiki.apache.org/confluence/display/FLINK/FLIP-492%3A+Support+Query+Modifications+for+Materialized+Tables
>>> [2]. https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines