Hi Zuo and Lincoln,

Thanks for your reply.

@Zuo

> detects whether the modified state is compatible or not with the
previous state automatically?


Automatic detection of state compatibility is technically feasible, but
there is currently no ready-made interface for checking whether the jobs
before and after a modification are compatible. However, this issue may be
beyond the scope of this FLIP. Regarding Flink SQL state compatibility, I
believe a separate FLIP is needed to describe its behavior in more detail.




@Lincoln


> And for the default behavior of alter operation under continuous mode,
can you add an example of starting the new job with hints(similar to the
case in the User Story section)?


Thank you for the suggestion. It is not easy for users to judge whether
recovery will be state-compatible. I think removing it is reasonable, as it
is not suitable to be provided as a public feature. I have already updated
the relevant content.



Best,
Feng Jin


On Thu, Dec 19, 2024 at 10:08 PM Lincoln Lee <lincoln.8...@gmail.com> wrote:

> Thanks Feng for driving this!
> Supporting modification is an important improvement for Materialized Table.
>
> Regarding ALTER TABLE retaining historical data, I have a similar question
> to Ron's: users can't easily judge whether a change is simple enough to
> keep state compatibility with the old refresh job under continuous mode.
> Therefore, I suggest removing the description of “Compatible Recovery”
> from the public interface section.
>
> And for the default behavior of the alter operation under continuous mode,
> can you add an example of starting the new job with hints (similar to the
> case in the User Story section)?
>
> Best,
> Lincoln Lee
>
>
> On Thu, Dec 19, 2024 at 2:05 PM Wei Zuo <1015766...@qq.com.invalid> wrote:
>
> > Hi, Feng
> >
> >
> > Is it possible for the framework to automatically detect whether the
> > modified state is compatible with the previous state? It would be
> > better to recognize query state compatibility automatically.
> >
> >
> > Best,
> >
> >
> > Zuo Wei
> >
> >
> >
> >
> > ------------------ Original Message ------------------
> > From: "dev" <ron9....@gmail.com>;
> > Sent: Thursday, December 19, 2024, 10:15 AM
> > To: "Feng Jin" <jinfeng1...@gmail.com>;
> > Cc: "dev" <dev@flink.apache.org>; "Lincoln Lee" <lincoln.8...@gmail.com>;
> > Subject: Re: [DISCUSS] FLIP-492: Support Query Modifications for
> > Materialized Tables.
> >
> >
> >
> > Hi, Feng
> >
> > The reply looks good to me. But I have one question: you mentioned the
> > `DESC MATERIALIZED TABLE` syntax in the FLIP, but we haven't provided
> > this syntax so far. I think we should add it to this FLIP if needed.
> >
> > Best,
> > Ron
> >
> > On Wed, Dec 18, 2024 at 4:52 PM Feng Jin <jinfeng1...@gmail.com> wrote:
> >
> > > Hi Ron
> > >
> > > Thanks for your reply.
> > >
> > > > Is it only possible to add columns at the end and not anywhere in
> > > table schema, some databases have this limitation, does lake storage
> > > such as Iceberg/Paimon have this limitation?
> > >
> > >
> > > Currently, we can restrict adding columns only to the end of the schema.
> > > Although both Paimon and Iceberg already support adding columns anywhere,
> > > there are still some systems that do not. I will include this in the FLIP.
> > >
> > >
> > > > In the Refresh Task Behavior section you mention partition hints, is it
> > > possible to clarify what it is in the FLIP?
> > >
> > >
> > > I have added the relevant details.
> > >
> > >
> > > > Are you able to articulate the default behavior?
> > >
> > >
> > > The detailed explanation for this part has been updated.
> > >
> > >
> > > > How users can determine if states are compatible?
> > >
> > >
> > > Users can only rely on their experience to make modifications. Currently,
> > > the Flink framework does not guarantee that changes to SQL logic will
> > > maintain state compatibility.
> > >
> > > I think we can add some suggestions in the user documentation in the
> > > future. While the framework itself cannot ensure state compatibility, some
> > > simple modification scenarios can indeed be compatible.
> > >
> > > For now, the responsibility is left to the users.
> > >
> > >
> > > Even if recovery ultimately fails, users still have the option to roll
> > > back to the original query or start consuming from a new offset by
> > > disabling recovery parameters.
> > >
> > >
> > > Best,
> > > Feng
> > >
> > >
> > > On Tue, Dec 17, 2024 at 10:37 AM Ron Liu <ron9....@gmail.com> wrote:
> > >
> > >> Hi Feng
> > >>
> > >> Thanks for initiating this FLIP. In a lakehouse, schema evolution of
> > >> tables due to changes in business logic is a very common scenario, so
> > >> Materialized Table's support for query modification can greatly improve
> > >> flexibility and usability, and we've seen that other similar products in
> > >> the industry also support this capability.
> > >>
> > >> I read the content of this FLIP and the overall design looks good, +1.
> > >> However, I have some questions as follows:
> > >>
> > >> 1. When using the `ALTER MATERIALIZED TABLE ... AS select` statement to
> > >> add columns, is it only possible to add columns at the end and not
> > >> anywhere in the table schema? Some databases have this limitation; does
> > >> lake storage such as Iceberg/Paimon have this limitation?
> > >> 2. In the Refresh Task Behavior section you mention partition hints. Is
> > >> it possible to clarify what they are in the FLIP?
> > >>
> > >> >>> *CONTINUOUS Mode:* Stops the old job and starts a new one with the
> > >> updated query.
> > >>
> > >>    - The initial position of the new job is controlled by the source
> > >>    parameters.
> > >>    - For compatible logic changes, recovery parameters
> > >>    (execution.state-recovery.path) can be manually set if state
> > >>    compatibility is confirmed.
> > >>
> > >>
> > >> 4. Are you able to articulate the default behavior?
> > >> 5. How can users determine if states are compatible?
> > >>
> > >> Best,
> > >> Ron
> > >>
> > >> On Mon, Dec 16, 2024 at 10:49 AM Feng Jin <jinfeng1...@gmail.com> wrote:
> > >>
> > >>> Hi, everyone,
> > >>>
> > >>> I’d like to initiate a discussion on FLIP-492: Support Query
> > >>> Modifications for Materialized Tables[1].
> > >>>
> > >>> In FLIP-435[2], we introduced *MATERIALIZED TABLES*. By defining query
> > >>> logic and specifying data freshness requirements, users can efficiently
> > >>> build data pipelines, greatly improving development productivity.
> > >>> FLIP-492 builds on this by addressing a common need: the ability to
> > >>> modify the query logic of an existing MATERIALIZED TABLE. Two approaches
> > >>> are proposed:
> > >>>
> > >>>
> > >>> *1. Modifying the Query Logic: ALTER MATERIALIZED TABLE AS <query>*
> > >>> Retain historical data while modifying the query logic:
> > >>>
> > >>> ```
> > >>> ALTER MATERIALIZED TABLE [catalog_name.][db_name.]table_name AS <query>
> > >>> ```
> > >>>
> > >>>
> > >>> *2. Replacing the Table: CREATE OR REPLACE MATERIALIZED TABLE*
> > >>> Reconstruct the table with a new query, discarding all historical data:
> > >>>
> > >>> ```
> > >>> CREATE [OR REPLACE] MATERIALIZED TABLE
> > >>> [catalog_name.][db_name.]table_name
> > >>> [ ([<table_constraint>]) ]
> > >>> [COMMENT table_comment]
> > >>> [PARTITIONED BY (partition_column_name1, partition_column_name2, ...)]
> > >>> [WITH (key1=val1, key2=val2, ...)]
> > >>> FRESHNESS = INTERVAL '<num>' { SECOND | MINUTE | HOUR | DAY }
> > >>> [REFRESH_MODE = { CONTINUOUS | FULL }]
> > >>> AS <select_statement>
> > >>> ```
> > >>>
> > >>> For a more detailed explanation of this proposal, please refer to the
> > >>> FLIP-492[1] documentation.
> > >>> Your feedback and suggestions are highly appreciated to help refine this
> > >>> proposal further.
> > >>>
> > >>> Lastly, I’d like to thank Ron and Lincoln (cc’d) for their valuable
> > >>> input and suggestions during the drafting process.
> > >>>
> > >>> Looking forward to hearing your thoughts!
> > >>>
> > >>>
> > >>> Best,
> > >>> Feng Jin
> > >>>
> > >>>
> > >>> [1]
> > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-492%3A+Support+Query+Modifications+for+Materialized+Tables
> > >>> [2]
> > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines
