Re: Dedicated sync for Iceberg materialized view

Igor Belianski Fri, 05 Dec 2025 13:25:44 -0800

Hi Benny and Walaa,

Could you please clarify the following statement from Benny's last email:


"We should avoid the need for consumers to expand nested MVs. I think the
producer should be combining the refresh states of all the nested MVs it
uses into two flat lists of source views and source tables. These source
tables can't contain storage tables."

Benny: Are you suggesting that the source-table-states should only capture
the leaf table nodes in the MV dependency pipeline?

Walaa: If we completely enumerate all source tables recursively, we must
consider the scenario of shared upstream tables (the "diamond pattern").
Specifically:

1. Do we allow duplicate table entries in the list (e.g., if the same table
was used at different snapshots across different refresh path traversals)?
2. Would we need to include the path in the refresh state entry to make
this data interpretable?

If we pursue the option of listing everything in the tree, we should choose
between:

- A) Permissive: Allow duplicate table entries, treating the list as a
client hint for tables to check. This leaves it up to engines to
disambiguate or skip entries, and the list may not be strictly exhaustive.
- B) Prescriptive: Establish an exactly defined meaning for each entry,
mandating clear rules for aggregation.

I am highly hesitant to mandate option B ( it would obviously be too
prescriptive for most engines).

Thanks,
Igor

On Thu, Dec 4, 2025 at 9:28 PM Benny Chow <[email protected]> wrote:

> I agree with Walaa.  In the last sync, we talked about smart producers and
> dumb consumers.  We should avoid the need for consumers to expand nested
> MVs.  I think the producer should be combining the refresh states of all
> the nested MVs it uses into two flat lists of source views and source
> tables.  These source tables can't contain storage tables.  When planning
> the refresh job, the producer can choose to use the nested MV's storage
> table or not and the refresh state needs to reflect this decision
> accordingly.
>
> There's also a somewhat corner case to consider.  It is completely
> possible for a source table to show up in a materialization at different
> snapshots.  In this scenario, it's up to the producer to decide whether to
> allow this or not or maybe just record the earliest snapshot.  These
> scenarios are inevitable when you get MVs built on MVs built on MVs such as
> in ETL scenarios.
>
> For max-staleness, I think it should strictly apply to only source tables
> and not storage tables.  For the ETL pipeline use case, the consumer is
> probably not going to care about this property.
>
> Thanks
>
> On Thu, Dec 4, 2025 at 5:37 PM Walaa Eldin Moustafa <[email protected]>
> wrote:
>
>> I think this creates significant friction for engine implementations (and
>> contradicts some of the principles we established earlier):
>>
>> * When the engine sees the tables backing mv_3, the spec provides no
>> built-in way to distinguish MV storage tables from true physical tables.
>> The engine must always perform an external lookup to determine whether a
>> “table” is really an MV.
>>
>> * Freshness evaluation becomes inconsistent: nested logical views require
>> only a one-shot leaf-table comparison, while nested MVs require recursive
>> traversal because their refresh-state does not contain leaf snapshots.
>>
>> * Even if an engine uses the MV definition to detect deeper staleness, it
>> cannot refresh the MV to a consistent base-table state. Option 2 refresh
>> semantics stop at the immediate MV boundary, so recursion for freshness
>> does not align with recursion for refresh.
>>
>> For these reasons, “allowing recursive expansion” is not practically
>> usable. It introduces complexity without providing coherent semantics.
>>
>> In summary, treating MVs either as views or as tables yields a consistent
>> model, but the optionality implied in Option 2 is misleading. The metadata
>> does not support cleanly mixing the two modes.
>>
>> Thanks,
>> Walaa.
>>
>> On Thu, Dec 4, 2025 at 11:44 AM Steven Wu <[email protected]> wrote:
>>
>>> Walaa,
>>>
>>> We are saying the `refres-state` only has a source view state for the
>>> source MV and a source table state for the source MV's storage table. It
>>> would allow both evaluation strategies (recursive or not)
>>> * non-recursive: as long as the MV refresh state is aligned with the
>>> source MV's storage table (with max staleness config), it is fresh. This
>>> semantic matches many ETL pipeline use cases.
>>> * recursive: If an engine wants to enforce stronger freshness semantics,
>>> it can recursively evaluate if source mv_1 and mv_2 themselves are fresh.
>>> The current spec wording mentioned this is allowed: "query engines may
>>> recursively expand the query tree to determine freshness".
>>>
>>> We wants the spec definition to be flexible enough to support both use
>>> cases.
>>>
>>> Thanks,
>>> Steven
>>>
>>>
>>> On Wed, Dec 3, 2025 at 5:30 PM Walaa Eldin Moustafa <
>>> [email protected]> wrote:
>>>
>>>> Hi Steven,
>>>>
>>>> > In option 2, when determining the freshness of mv_3, engines can
>>>> choose to recursively evaluate the freshness of mv_1 and mv_2 since they
>>>> are also MVs. But engines can also choose not to.
>>>>
>>>> Does not "evaluating freshness of mv_1 and mv_2" mean that engines
>>>> consider mv_1 and mv_2 as views? "Tables" do not have freshness.
>>>>
>>>> Thanks,
>>>> Walaa.
>>>>
>>>>
>>>> On Tue, Nov 25, 2025 at 11:49 PM Jan Kaul via dev <
>>>> [email protected]> wrote:
>>>>
>>>>> Thank you Steven,
>>>>>
>>>>> I've included the "max-staleness" in the PR. Please have a look and
>>>>> give feedback on the phrasing.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jan
>>>>> On 11/19/25 22:37, Steven Wu wrote:
>>>>>
>>>>> Thanks everyone for joining today's sync. We had a good discussion on
>>>>> how to interpret the "max staleness" config.
>>>>>
>>>>> You can find the meeting notes here.
>>>>>
>>>>> https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.eho7jgm13usg
>>>>>
>>>>> Recording is also linked in the doc (thanks Kevin).
>>>>>
>>>>> For the next step, maybe we can collaborate on the MV spec PR to flush
>>>>> the exact wording for staleness config and semantic.
>>>>> https://github.com/apache/iceberg/pull/11041/files
>>>>>
>>>>>
>>>>> On Tue, Nov 18, 2025 at 1:05 PM Benny Chow <[email protected]> wrote:
>>>>>
>>>>>> Thanks Igor.  The PR has a suggestion for exactly what you
>>>>>> suggested.  I called it a "*warm*" state which is a state where
>>>>>> stale materialization can still be used.
>>>>>> https://github.com/apache/iceberg/pull/11041/files#r2474661166
>>>>>>
>>>>>> I think if we continue with the assumption that MVs can only
>>>>>> reference iceberg tables and views, then it makes sense for the
>>>>>> max-staleness grace period to be dynamic based on snapshot history.   
>>>>>> This
>>>>>> is what Trino does:
>>>>>> https://trino.io/docs/current/connector/iceberg.html?utm_source=chatgpt.com#materialized-views
>>>>>>
>>>>>> If there are non-Iceberg tables in the view SQL, then the grace
>>>>>> period will have to be based on last refresh which is also what Trino
>>>>>> describes here:
>>>>>> https://trino.io/docs/current/sql/create-materialized-view.html#mv-grace-period
>>>>>>
>>>>>> Should we call out both scenarios in the MV spec?  I think this is
>>>>>> worth being explicit here.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>> On Tue, Nov 18, 2025 at 11:03 AM Igor Belianski <
>>>>>> [email protected]> wrote:
>>>>>>
>>>>>>> Re:  max-stalenss-ms interpretation
>>>>>>> proposal:
>>>>>>>    A Materialized View(MNV) considered fresh if and only if the
>>>>>>> results stored are equivalent to the those that would have been 
>>>>>>> obtained by
>>>>>>> running MV's defining query at some point in time within interval :
>>>>>>>  [CurrentTime-max-staleness-ms, Current_time]
>>>>>>>
>>>>>>> Note: this definition allows for optimization proposed by option 2
>>>>>>> (implementing which is definitely a great idea) , but doesn't mandate 
>>>>>>> it.
>>>>>>>  One can also imagine some other optimization that would be possible
>>>>>>> given definition above , and would be left up to the engines toi 
>>>>>>> implement.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Nov 18, 2025 at 10:54 AM Steven Wu <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> A reminder for tomorrow's community sync for the MV spec.
>>>>>>>> https://calendar.app.google/T4zSk6qKWoy1vV6P7
>>>>>>>>
>>>>>>>> We have one open question from the last meeting on how
>>>>>>>> `max-stalenesss-ms` should be interpreted. You can find more details 
>>>>>>>> in the
>>>>>>>> meeting notes.
>>>>>>>>
>>>>>>>> https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.75r8e0rwq02o
>>>>>>>>
>>>>>>>> Please also bring other topics that we should discuss.
>>>>>>>>
>>>>>>>> On Sat, Nov 1, 2025 at 10:14 PM Steven Wu <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Sorry for the delay. Here are the recording and meeting notes for
>>>>>>>>> the MV sync meeting on Wednesday, Oct 29.
>>>>>>>>>
>>>>>>>>> https://docs.google.com/document/d/1EVCM-hKr5tY33t0Yzq37cAXSPncySc6Ghke7OZEcqXU/edit?tab=t.0#heading=h.75r8e0rwq02o
>>>>>>>>>
>>>>>>>>> We have started to collect them in the above google doc.
>>>>>>>>>
>>>>>>>>> On Mon, Oct 27, 2025 at 8:58 AM Péter Váry <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> If we have materialized views (MVs) and support for incremental
>>>>>>>>>> change scans, then by introducing a Java-based representation of the 
>>>>>>>>>> view,
>>>>>>>>>> we can expose a scan API that always returns up-to-date results for 
>>>>>>>>>> the MV.
>>>>>>>>>>
>>>>>>>>>> The scan could include multiple tasks:
>>>>>>>>>>
>>>>>>>>>>    - A task for reading the current version of the MV.
>>>>>>>>>>    - An incremental change log scan covering the range between
>>>>>>>>>>    the snapshot ID of the source table at the time the MV was last 
>>>>>>>>>> refreshed
>>>>>>>>>>    and its current snapshot ID. Applying the Java representation of 
>>>>>>>>>> the view
>>>>>>>>>>    when transformations are required.
>>>>>>>>>>
>>>>>>>>>> This approach allows us to build an always up-to-date index
>>>>>>>>>> table/single source MV, using existing components.
>>>>>>>>>>
>>>>>>>>>> Benny Chow <[email protected]> ezt írta (időpont: 2025. okt. 24.,
>>>>>>>>>> P, 7:44):
>>>>>>>>>>
>>>>>>>>>>> Hi Peter
>>>>>>>>>>>
>>>>>>>>>>> I think the current proposal would support your example.  In
>>>>>>>>>>> most situations, replace table operations after a view is 
>>>>>>>>>>> materialized
>>>>>>>>>>> wouldn’t invalidate the materialization.  However, if the view 
>>>>>>>>>>> includes
>>>>>>>>>>> metadata columns, then the replace operations should invalidate the
>>>>>>>>>>> materialization.
>>>>>>>>>>>
>>>>>>>>>>> This also brings up another important point that engines will
>>>>>>>>>>> differ on what views can be materialized or not.  For example, maybe
>>>>>>>>>>> metadata columns are not allowed similar to non deterministic 
>>>>>>>>>>> functions
>>>>>>>>>>> like random.  But some engines like Dremio may allow views that use 
>>>>>>>>>>> current
>>>>>>>>>>> date functions.  It should be possible for one engine to 
>>>>>>>>>>> materialize a view
>>>>>>>>>>> and another engine to look at the query tree and decide it’s not a 
>>>>>>>>>>> view it
>>>>>>>>>>> supports materializations on and choose not to use that 
>>>>>>>>>>> materialization.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Benny
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Oct 23, 2025, at 8:44 AM, Péter Váry <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>> 
>>>>>>>>>>> Hi All,
>>>>>>>>>>>
>>>>>>>>>>> I’ve been catching up on the discussion and wanted to share an
>>>>>>>>>>> observation. One aspect that stands out to me in the proposed 
>>>>>>>>>>> staleness
>>>>>>>>>>> evaluation logic is that snapshots which don’t modify data can 
>>>>>>>>>>> still affect
>>>>>>>>>>> the view’s contents if the view includes metadata columns.
>>>>>>>>>>>
>>>>>>>>>>> I was considering using a materialized view as an index for a
>>>>>>>>>>> given table to accelerate the conversion of equality deletes to 
>>>>>>>>>>> position
>>>>>>>>>>> deletes. For example, the query might look like:
>>>>>>>>>>>
>>>>>>>>>>> *SELECT _POS, _FILE, id FROM target_table*
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> During compaction, the materialized view would need to be
>>>>>>>>>>> refreshed to ensure it reflects the correct data.
>>>>>>>>>>>
>>>>>>>>>>> Does this seem like a valid use case? Or should we explicitly
>>>>>>>>>>> exclude scenarios like this?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Peter
>>>>>>>>>>>
>>>>>>>>>>> Steven Wu <[email protected]> ezt írta (időpont: 2025. okt.
>>>>>>>>>>> 20., H, 17:30):
>>>>>>>>>>>
>>>>>>>>>>>> Walaa,
>>>>>>>>>>>>
>>>>>>>>>>>> > while Option 2 is described in your summary as "giving
>>>>>>>>>>>> engines *flexibility* to determine freshness recursively
>>>>>>>>>>>> beyond a source MV", that *isn’t achievable* under the MV
>>>>>>>>>>>> evaluation model itself.
>>>>>>>>>>>> Because each MV treats upstream MVs as physical tables,
>>>>>>>>>>>> recursion stops at the first materialized boundary; *deeper
>>>>>>>>>>>> staleness cannot be discovered without switching to a logical-view
>>>>>>>>>>>> evaluation model, i.e., stepping outside the MV model altogether 
>>>>>>>>>>>> (note that
>>>>>>>>>>>> in Option 3 we can determine recursive staleness while still 
>>>>>>>>>>>> inside the MV
>>>>>>>>>>>> model).*
>>>>>>>>>>>>
>>>>>>>>>>>> In option 2, when determining the freshness of mv_3, engines
>>>>>>>>>>>> can choose to recursively evaluate the freshness of mv_1 and mv_2 
>>>>>>>>>>>> since
>>>>>>>>>>>> they are also MVs. But engines can also choose not to.
>>>>>>>>>>>>
>>>>>>>>>>>> > This means that there seems to be an implicit “Option 3”.
>>>>>>>>>>>> This option treats MVs as logical views, i.e., storing only view 
>>>>>>>>>>>> versions +
>>>>>>>>>>>> base table snapshot IDs (no MV storage snapshot IDs, no per-path 
>>>>>>>>>>>> lineage).
>>>>>>>>>>>>
>>>>>>>>>>>> In the new option 3 you described, how could the engine update
>>>>>>>>>>>> mv3's refresh state for base table_a and table_b? unless all 
>>>>>>>>>>>> connected MVs
>>>>>>>>>>>> are refreshed and committed in one single transaction, one entry 
>>>>>>>>>>>> per base
>>>>>>>>>>>> table doesn't seem feasible. That's the main reason for option 1 
>>>>>>>>>>>> to require
>>>>>>>>>>>> the lineage path information in refresh state for base tables.
>>>>>>>>>>>>
>>>>>>>>>>>> It also seems that option 3 can only interpret freshness
>>>>>>>>>>>> recursively, while today there are engines that support MVs without
>>>>>>>>>>>> recursively evaluating source MVs.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Steven
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Oct 20, 2025 at 1:44 AM Walaa Eldin Moustafa <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Steven,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for organizing the series and summarizing the outcome.
>>>>>>>>>>>>>
>>>>>>>>>>>>> After re-reading the Option 1/2 proposal, initially I
>>>>>>>>>>>>> interpreted Option 1 as simply expanding MVs like regular logical 
>>>>>>>>>>>>> views. On
>>>>>>>>>>>>> closer look, it is actually more complex. It also preserves 
>>>>>>>>>>>>> per-path
>>>>>>>>>>>>> lineage state (e.g., multiple entries for the same base table via 
>>>>>>>>>>>>> different
>>>>>>>>>>>>> parents), which increases expressiveness but significantly 
>>>>>>>>>>>>> increases
>>>>>>>>>>>>> metadata complexity. So I agree it is not a practical option.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This means that there seems to be an implicit “Option 3”. This
>>>>>>>>>>>>> option treats MVs as logical views, i.e., storing only view 
>>>>>>>>>>>>> versions + base
>>>>>>>>>>>>> table snapshot IDs (no MV storage snapshot IDs, no per-path 
>>>>>>>>>>>>> lineage). Under
>>>>>>>>>>>>> this model, mv_3’s metadata might look like:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Type   Name     Tracked State
>>>>>>>>>>>>> -----  -------  -----------------------
>>>>>>>>>>>>> view   mv_1     view_version_id
>>>>>>>>>>>>> view   mv_2     view_version_id
>>>>>>>>>>>>> table  table_a  table_snapshot_id
>>>>>>>>>>>>> table  table_b  table_snapshot_id
>>>>>>>>>>>>>
>>>>>>>>>>>>> This preserves logical semantics and aligns MV behavior with
>>>>>>>>>>>>> pure views.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *If we choose Option 2 (treat source MV as a materialized
>>>>>>>>>>>>> table), we may have to be consider those constraints:*
>>>>>>>>>>>>>
>>>>>>>>>>>>> * Staleness only degrades up the chain. mv_1 and mv_2 may
>>>>>>>>>>>>> already be stale relative to the base tables, but if mv_3 is 
>>>>>>>>>>>>> refreshed
>>>>>>>>>>>>> using their storage snapshots, then mv_3 will be marked as fresh 
>>>>>>>>>>>>> under
>>>>>>>>>>>>> Option 2, even though all three MVs are stale relative to the 
>>>>>>>>>>>>> base tables.
>>>>>>>>>>>>>
>>>>>>>>>>>>> * Engines can no longer discover staleness beyond mv_1. Once
>>>>>>>>>>>>> mv_3 sees mv_1 (or mv_2) as fresh based only on their storage 
>>>>>>>>>>>>> snapshots, it
>>>>>>>>>>>>> will not expand into mv_1 or mv_2 to check whether they are stale 
>>>>>>>>>>>>> relative
>>>>>>>>>>>>> to the base tables.
>>>>>>>>>>>>>
>>>>>>>>>>>>> * If mv_2 and mv_3 were purely logical views instead of MVs,
>>>>>>>>>>>>> they would evaluate directly against base tables and return newer 
>>>>>>>>>>>>> data.
>>>>>>>>>>>>> Under Option 2, the same definitions but materialized upstream 
>>>>>>>>>>>>> produce
>>>>>>>>>>>>> different data, not just different metadata.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Therefore, while Option 2 is described in your summary as
>>>>>>>>>>>>> "giving engines *flexibility* to determine freshness
>>>>>>>>>>>>> recursively beyond a source MV", that *isn’t achievable*
>>>>>>>>>>>>> under the MV evaluation model itself.
>>>>>>>>>>>>> Because each MV treats upstream MVs as physical tables,
>>>>>>>>>>>>> recursion stops at the first materialized boundary; *deeper
>>>>>>>>>>>>> staleness cannot be discovered without switching to a logical-view
>>>>>>>>>>>>> evaluation model, i.e., stepping outside the MV model altogether 
>>>>>>>>>>>>> (note that
>>>>>>>>>>>>> in Option 3 we can determine recursive staleness while still 
>>>>>>>>>>>>> inside the MV
>>>>>>>>>>>>> model).*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let me know your thoughts. I slightly prefer Option 3. I’m
>>>>>>>>>>>>> also fine with Option 2, but I don’t think the flexibility to 
>>>>>>>>>>>>> recursively
>>>>>>>>>>>>> determine freshness actually exists under its evaluation model. 
>>>>>>>>>>>>> Not sure if
>>>>>>>>>>>>> this changes anyone’s view, but I wanted to clarify how I’m 
>>>>>>>>>>>>> reading it.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Walaa.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Oct 8, 2025 at 11:11 PM Benny Chow <[email protected]>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I just listened to the recording.  I'm the tech lead for MVs
>>>>>>>>>>>>>> at Dremio and responsible for both refresh management and query 
>>>>>>>>>>>>>> rewrites
>>>>>>>>>>>>>> with MVs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's great that we seem to agree that Iceberg MV spec won't
>>>>>>>>>>>>>> require that MVs always be up to date in order to be usable for 
>>>>>>>>>>>>>> query
>>>>>>>>>>>>>> rewrites.  There can be many data consistency issues (as Dan 
>>>>>>>>>>>>>> pointed out)
>>>>>>>>>>>>>> but that is the state of affairs today.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It sounds like we are converging on the following scenarios
>>>>>>>>>>>>>> for an engine to validate the MV freshness:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1.  Use storage table without any validation.  This might be
>>>>>>>>>>>>>> the extreme "async MV" example.
>>>>>>>>>>>>>> 2.  Ignore storage table even if one exists because SQL
>>>>>>>>>>>>>> command or use case requires that.
>>>>>>>>>>>>>> 3.  Use storage table only if data is not more than x hours
>>>>>>>>>>>>>> old.  This can be achieved with the proposed 
>>>>>>>>>>>>>> refresh-start-timestamp-ms
>>>>>>>>>>>>>> which is currently in the proposed spec.  For this to work
>>>>>>>>>>>>>> with MVs built on MVs, we should probably state in the spec that 
>>>>>>>>>>>>>> if a MV is
>>>>>>>>>>>>>> built on another MV, then it needs to inherit the
>>>>>>>>>>>>>> refresh-start-timestamp-ms of the child MV.  In Steven's 
>>>>>>>>>>>>>> example, when
>>>>>>>>>>>>>> building mv3, refresh-start-timestamp-ms needs to be set to the 
>>>>>>>>>>>>>> minimum of
>>>>>>>>>>>>>> mv1 or mv2's refresh-start-timestamp-ms.  If this property name 
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> confusing, we can rename it to 
>>>>>>>>>>>>>> "refresh-earliest-table-timestamp-ms".  I
>>>>>>>>>>>>>> originally proposed this property and also listed out other 
>>>>>>>>>>>>>> benefits here:
>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/11041#discussion_r1779797796
>>>>>>>>>>>>>> Also, at the time, MVs built on MVs weren't being considered.  
>>>>>>>>>>>>>> Now that it
>>>>>>>>>>>>>> is, I would recommend we have both "refresh-start-timestamp-ms" 
>>>>>>>>>>>>>> (when the
>>>>>>>>>>>>>> refresh was started on the storage table) and
>>>>>>>>>>>>>> "refresh-earliest-table-timestamp-ms" (used for freshness 
>>>>>>>>>>>>>> validation).
>>>>>>>>>>>>>> 4.  Don't use the storage table if it is older than X hours.
>>>>>>>>>>>>>> This is what I had originally proposed for the
>>>>>>>>>>>>>> *materialization.max-stalessness-ms* view property here:
>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/11041#discussion_r1744837644
>>>>>>>>>>>>>> It wasn't meant to validate the freshness but more to prevent 
>>>>>>>>>>>>>> use of a
>>>>>>>>>>>>>> materialization after some criteria.
>>>>>>>>>>>>>> 5.  Use storage table if recursive validation passes... i.e.
>>>>>>>>>>>>>> refresh-state matches the current expanded query tree state.  
>>>>>>>>>>>>>> This is what
>>>>>>>>>>>>>> I think Steven is calling the "synchronous MV".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For scenario 1-4, it would support the nice use case of an
>>>>>>>>>>>>>> Iceberg client using a view's data through the storage table 
>>>>>>>>>>>>>> without
>>>>>>>>>>>>>> needing to know how to parse/validate/expand any view SQLs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In Dremio's planner, we primarily use scenario 1 and 4
>>>>>>>>>>>>>> together to determine MV validity for query rewrite.  Scenario 2 
>>>>>>>>>>>>>> and 5 also
>>>>>>>>>>>>>> apply in certain situations.  For scenario 3, Dremio only 
>>>>>>>>>>>>>> exposes the
>>>>>>>>>>>>>> "refresh-earliest-table-timestamp-ms" as an fyi to the user but 
>>>>>>>>>>>>>> it would be
>>>>>>>>>>>>>> interesting to allow the user to set this time so that they 
>>>>>>>>>>>>>> could run
>>>>>>>>>>>>>> queries and be 100% certain that they were not seeing data older 
>>>>>>>>>>>>>> than x
>>>>>>>>>>>>>> hours.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>> Benny
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Oct 8, 2025 at 3:37 PM Steven Wu <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> correction for a typo.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Prashanth brought up another scenario of compaction/rewrite
>>>>>>>>>>>>>>> where a new snapshot was added *with* actual data change
>>>>>>>>>>>>>>> -->
>>>>>>>>>>>>>>> Prashanth brought up another scenario of compaction/rewrite
>>>>>>>>>>>>>>> where a new snapshot was added *without* actual data change
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Oct 8, 2025 at 2:12 PM Steven Wu <
>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks everyone for joining the MV discussion meeting. We
>>>>>>>>>>>>>>>> will continue to have the recurring sync meeting on Wednesday 
>>>>>>>>>>>>>>>> 9 am
>>>>>>>>>>>>>>>> (Pacific) every 3 weeks until we get to the finish line where 
>>>>>>>>>>>>>>>> Jan's MV spec
>>>>>>>>>>>>>>>> PR [1] is merged. I have scheduled our next meeting on Oct 29 
>>>>>>>>>>>>>>>> in the
>>>>>>>>>>>>>>>> Iceberg dev events calendar.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Here is the video recording for today's meeting.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://drive.google.com/file/d/1-nfhBPDWLoAFDu5cKP0rwLd_30HB6byR/view?usp=sharing
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We mostly discussed freshness evaluation. Here is the
>>>>>>>>>>>>>>>> meeting summary.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>    1. For tracking the refresh state for the source MV
>>>>>>>>>>>>>>>>    [2], the consensus is option 2 (treating source MV as a 
>>>>>>>>>>>>>>>> materialized table)
>>>>>>>>>>>>>>>>    which would give engines the flexibility on freshness 
>>>>>>>>>>>>>>>> determination
>>>>>>>>>>>>>>>>    (recursive beyond source MV or not).
>>>>>>>>>>>>>>>>    2. Earlier design doc [3] discussed max staleness
>>>>>>>>>>>>>>>>    config. But it wasn't reflected in the spec PR. The general 
>>>>>>>>>>>>>>>> opinion is to
>>>>>>>>>>>>>>>>    add the config to the spec PR. The open question is whether 
>>>>>>>>>>>>>>>> the `
>>>>>>>>>>>>>>>>    materialization.max-staleness-ms` config should be
>>>>>>>>>>>>>>>>    added to the view metadata or the storage table metadata. 
>>>>>>>>>>>>>>>> Either can work.
>>>>>>>>>>>>>>>>    We just need to decide which makes a little better fit.
>>>>>>>>>>>>>>>>    3. Prashanth brought up schema change with default
>>>>>>>>>>>>>>>>    value and how it may affect the MV refresh state (for SQL 
>>>>>>>>>>>>>>>> representation
>>>>>>>>>>>>>>>>    with select *). Jan mentioned that snapshot contains schema 
>>>>>>>>>>>>>>>> id when the
>>>>>>>>>>>>>>>>    snapshot was created. Engine can compare the snapshot 
>>>>>>>>>>>>>>>> schema id to the
>>>>>>>>>>>>>>>>    source table schema id during freshness evaluation. There 
>>>>>>>>>>>>>>>> is no need for
>>>>>>>>>>>>>>>>    additional schema info in refresh-state tracking in the 
>>>>>>>>>>>>>>>> storage table.
>>>>>>>>>>>>>>>>    4. Prashanth brought up another scenario of
>>>>>>>>>>>>>>>>    compaction/rewrite where a new snapshot was added with 
>>>>>>>>>>>>>>>> actual data change.
>>>>>>>>>>>>>>>>    The general take is that the engine can optimize and decide 
>>>>>>>>>>>>>>>> that MV is
>>>>>>>>>>>>>>>>    fresh as the new snapshot doesn't have any data change.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> We can add some clarifications in the spec PR for freshness
>>>>>>>>>>>>>>>> evaluation based on the above discussions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> [1] https://github.com/apache/iceberg/pull/11041
>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1_StBW5hCQhumhIvgbdsHjyW0ED3dWMkjtNzyPp9Sfr8/edit?tab=t.0
>>>>>>>>>>>>>>>> [3]
>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1UnhldHhe3Grz8JBngwXPA6ZZord1xMedY5ukEhZYF-A/edit?tab=t.0#heading=h.3wigecex0zls
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Sep 25, 2025 at 9:27 AM Steven Wu <
>>>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Iceberg materialized view has been discussed in the
>>>>>>>>>>>>>>>>> community for a long time. Thanks Jan Kaul for driving the 
>>>>>>>>>>>>>>>>> discussion and
>>>>>>>>>>>>>>>>> the spec PR. It has been stalled for a long time due to lack 
>>>>>>>>>>>>>>>>> of consensus
>>>>>>>>>>>>>>>>> on 1 or 2 topics. In Wed's Iceberg community sync meeting, 
>>>>>>>>>>>>>>>>> Talat brought up
>>>>>>>>>>>>>>>>> the question on how to move forward and if we can have a 
>>>>>>>>>>>>>>>>> dedicated meeting
>>>>>>>>>>>>>>>>> for MV.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have set up a meeting on *Oct 8 (9-10 am Pacific)*. If
>>>>>>>>>>>>>>>>> you subscribe to the "Iceberg Dev Events" calendar, you
>>>>>>>>>>>>>>>>> should be able to see it. If not, here is the link:
>>>>>>>>>>>>>>>>> https://meet.google.com/nfe-guyq-pqf
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We are going to discuss
>>>>>>>>>>>>>>>>> * remaining open questions
>>>>>>>>>>>>>>>>> * unresolved concerns
>>>>>>>>>>>>>>>>> * the next step and hopefully some consensus on moving
>>>>>>>>>>>>>>>>> forward
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> MV spec PR is up to date. Jan has incorporated recent
>>>>>>>>>>>>>>>>> feedback. This should be the base of the discussion.
>>>>>>>>>>>>>>>>> https://github.com/apache/iceberg/pull/11041
>>>>>>>>>>>>>>>>> <https://www.google.com/url?q=https://github.com/apache/iceberg/pull/11041&sa=D&source=calendar&usd=2&usg=AOvVaw3w0TjRpwbC17AGzmxZmElM>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Dev discussion thread (a long-running thread started by
>>>>>>>>>>>>>>>>> Jan).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://lists.apache.org/thread/y1vlpzbn2x7xookjkffcl08zzyofk5hf
>>>>>>>>>>>>>>>>> <https://www.google.com/url?q=https://lists.apache.org/thread/y1vlpzbn2x7xookjkffcl08zzyofk5hf&sa=D&source=calendar&usd=2&usg=AOvVaw0fotlsrnRBOb820mA5JRyB>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The mail archive has broken lineage and doesn't show all
>>>>>>>>>>>>>>>>> replies. Email subject is "*[DISCUSS] Iceberg
>>>>>>>>>>>>>>>>> Materialzied Views*".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> Steven
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

Re: Dedicated sync for Iceberg materialized view

Reply via email to