Hi, Feng

The reply looks good to me. I have one question, though: the FLIP mentions
the `DESC MATERIALIZED TABLE` syntax, but we don't support this syntax yet.
I think we should add it to this FLIP if it is needed.

Best,
Ron

On Wed, Dec 18, 2024 at 16:52, Feng Jin <jinfeng1...@gmail.com> wrote:

> Hi Ron
>
> Thanks for your reply.
>
> > Is it only possible to add columns at the end of the table schema, and
> > not at an arbitrary position? Some databases have this limitation. Do
> > lake storage systems such as Iceberg/Paimon have it too?
>
>
> For now, we can restrict column additions to the end of the schema.
> Although both Paimon and Iceberg already support adding columns at any
> position, some systems still do not. I will include this in the FLIP.
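>
> As a minimal sketch of what this restriction would mean in practice,
> assume a hypothetical materialized table `dwd_orders` whose current query
> selects `order_id` and `amount` from a hypothetical source table `orders`
> (none of these names come from the FLIP):
>
> ```
> -- Hypothetical table and column names, for illustration only.
> -- Allowed under the restriction: `currency` is appended after the
> -- existing columns.
> ALTER MATERIALIZED TABLE dwd_orders AS
> SELECT order_id, amount, currency FROM orders;
>
> -- Expected to be rejected under the restriction: `currency` is inserted
> -- before the existing column `amount`.
> ALTER MATERIALIZED TABLE dwd_orders AS
> SELECT order_id, currency, amount FROM orders;
> ```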
>
>
> > In the Refresh Task Behavior section you mention partition hints. Could
> > you clarify what they are in the FLIP?
>
>
> I have added the relevant details.
>
>
> >  Are you able to articulate the default behavior?
>
>
> The detailed explanation for this part has been updated.
>
>
> > How can users determine whether states are compatible?
>
>
> Users can only rely on their experience to make modifications. Currently,
> the Flink framework does not guarantee that changes to SQL logic will
> maintain state compatibility.
>
> I think we can add some suggestions in the user documentation in the
> future. While the framework itself cannot ensure state compatibility, some
> simple modification scenarios can indeed be compatible.
>
> For now, the responsibility is left to the users.
>
>
> Even if recovery ultimately fails, users still have the option to roll
> back to the original query or start consuming from a new offset by
> disabling recovery parameters.
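>
> As a rough sketch of these two fallbacks in the SQL client: the table and
> column names are hypothetical, and whether the new job picks up the
> session-level option in exactly this way is an assumption rather than
> something the FLIP confirms.
>
> ```
> -- Fallback 1: roll back to the original query.
> ALTER MATERIALIZED TABLE dwd_orders AS
> SELECT order_id, amount FROM orders;
>
> -- Fallback 2: keep the new query but clear the recovery path, so the new
> -- job starts without restoring state. Its start position is then
> -- controlled by the source parameters.
> RESET 'execution.state-recovery.path';
> ALTER MATERIALIZED TABLE dwd_orders AS
> SELECT order_id, amount, currency FROM orders;
> ```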
>
>
>
>
> Best,
> Feng
>
>
> On Tue, Dec 17, 2024 at 10:37 AM Ron Liu <ron9....@gmail.com> wrote:
>
>> Hi Feng
>>
>> Thanks for initiating this FLIP. In a lakehouse, schema evolution of
>> tables due to changes in business logic is a very common scenario, so
>> support for modifying the query of a Materialized Table can greatly
>> improve flexibility and usability. We've also seen that other similar
>> products in the industry support this capability.
>>
>> I read the content of this FLIP and the overall design looks good, +1.
>> However, I have some questions as follows:
>>
>> 1. When using the `ALTER MATERIALIZED TABLE ... AS select` statement to
>> add columns, is it only possible to add them at the end of the table
>> schema, and not at an arbitrary position? Some databases have this
>> limitation. Do lake storage systems such as Iceberg/Paimon have it too?
>> 2. In the Refresh Task Behavior section you mention partition hints.
>> Could you clarify what they are in the FLIP?
>>
>> > *CONTINUOUS Mode:* Stops the old job and starts a new one with the
>> > updated query.
>>
>>    - The initial position of the new job is controlled by the source
>>    parameters.
>>    - For compatible logic changes, recovery parameters
>>    (execution.state-recovery.path) can be manually set if state
>>    compatibility is confirmed.
>>
>>
>> 4. Are you able to articulate the default behavior?
>> 5. How can users determine whether states are compatible?
>>
>> Best,
>> Ron
>>
>> On Mon, Dec 16, 2024 at 10:49, Feng Jin <jinfeng1...@gmail.com> wrote:
>>
>>> Hi, everyone,
>>>
>>> I’d like to initiate a discussion on FLIP-492: Support Query
>>> Modifications for Materialized Tables[1].
>>>
>>> In FLIP-435[2], we introduced *MATERIALIZED TABLES*. By defining query
>>> logic and specifying data freshness requirements, users can efficiently
>>> build data pipelines, greatly improving development productivity.
>>> FLIP-492 builds on this by addressing a common need: the ability to
>>> modify the query logic of an existing MATERIALIZED TABLE. Two approaches
>>> are proposed:
>>>
>>>
>>> *1. Modifying the Query Logic: ALTER MATERIALIZED TABLE AS <query>*
>>> Retain historical data while modifying the query logic:
>>>
>>> ```
>>> ALTER MATERIALIZED TABLE [catalog_name.][db_name.]table_name AS <query>
>>> ```
>>>
>>>
>>> *2. Replacing the Table: CREATE OR REPLACE MATERIALIZED TABLE*
>>> Reconstruct the table with a new query, discarding all historical data:
>>>
>>> ```
>>> CREATE [OR REPLACE] MATERIALIZED TABLE
>>> [catalog_name.][db_name.]table_name
>>> [ ([<table_constraint>]) ]
>>> [COMMENT table_comment]
>>> [PARTITIONED BY (partition_column_name1, partition_column_name2, ...)]
>>> [WITH (key1=val1, key2=val2, ...)]
>>> FRESHNESS = INTERVAL '<num>' { SECOND | MINUTE | HOUR | DAY }
>>> [REFRESH_MODE = { CONTINUOUS | FULL }]
>>> AS <select_statement>
>>> ```
>>>
>>> For a more detailed explanation of this proposal, please refer to the
>>> FLIP-492[1] documentation.
>>> Your feedback and suggestions are highly appreciated to help refine this
>>> proposal further.
>>>
>>> Lastly, I’d like to thank Ron and Lincoln (cc’d) for their valuable
>>> input and suggestions during the drafting process.
>>>
>>> Looking forward to hearing your thoughts!
>>>
>>>
>>> Best,
>>> Feng Jin
>>>
>>>
>>> [1].
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-492%3A+Support+Query+Modifications+for+Materialized+Tables
>>> [2].
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-435%3A+Introduce+a+New+Materialized+Table+for+Simplifying+Data+Pipelines
>>>
>>
