Thanks Steven! So would you agree that resolution using default-catalog and
default-namespace does not provide full determinism, and requires a
supporting safety mechanism?
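
To make the question concrete, here is a minimal sketch of name-based resolution with a UUID check layered on top. It is only an illustration under stated assumptions: the names `TableRef`, `qualify`, `resolve_and_validate`, and `ResolutionMismatch` are hypothetical and not part of any Iceberg API, and each engine's catalog configuration is modeled as a plain dictionary from catalog name to `{(namespace, table): uuid}`.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

# Hypothetical model: engine catalog name -> {(namespace, table): table UUID}.
EngineCatalogs = Dict[str, Dict[Tuple[str, str], str]]


@dataclass(frozen=True)
class TableRef:
    """A table reference as stored in view metadata (illustrative only)."""
    name: str  # identifier as written in the view SQL, e.g. "sales.orders"
    uuid: str  # UUID of the table the name resolved to at view creation


class ResolutionMismatch(Exception):
    """Raised when a name resolves to nothing or to a different table."""


def qualify(name: str, default_catalog: str, default_namespace: str) -> Tuple[str, str, str]:
    """Fill in missing catalog/namespace parts from the supplied defaults."""
    parts = name.split(".")
    if len(parts) == 1:
        return default_catalog, default_namespace, parts[0]
    if len(parts) == 2:
        return default_catalog, parts[0], parts[1]
    return parts[0], ".".join(parts[1:-1]), parts[-1]


def resolve_and_validate(ref: TableRef, default_catalog: str,
                         default_namespace: str, catalogs: EngineCatalogs) -> str:
    """Resolve by name, then fail if the UUID differs from the stored one."""
    catalog, namespace, table = qualify(ref.name, default_catalog, default_namespace)
    actual = catalogs.get(catalog, {}).get((namespace, table))
    if actual is None:
        raise ResolutionMismatch(f"{catalog}.{namespace}.{table} not found")
    if actual != ref.uuid:
        raise ResolutionMismatch(
            f"{catalog}.{namespace}.{table} has UUID {actual}, expected {ref.uuid}")
    return actual


# Two engines use the same catalog name "prod" for different underlying catalogs.
ref = TableRef("sales.orders", "uuid-1")
engine_a = {"prod": {("sales", "orders"): "uuid-1"}}
engine_b = {"prod": {("sales", "orders"): "uuid-2"}}

resolve_and_validate(ref, "prod", "default", engine_a)  # resolves and validates
try:
    resolve_and_validate(ref, "prod", "default", engine_b)
except ResolutionMismatch as err:
    print("rejected:", err)
```

In this sketch the stored defaults produce the same fully qualified name on both engines, yet only the UUID comparison notices that the second engine's "prod" points at a different table.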

Thanks,
Walaa.

On Wed, May 7, 2025 at 10:30 PM Steven Wu <stevenz...@gmail.com> wrote:

> > If the current model is considered deterministic, do you think
> `default-catalog` and `default-namespace` fields provide enough determinism
> to eliminate the need for UUIDs when storing table identifiers?
>
> I am fine with storing UUIDs for table identifiers in the view. Basically,
> view creation resolves all referenced tables/views with UUIDs. View
> consumers can validate resolved tables/views with the stored UUIDs and fail
> the query on a mismatch.
>
> The UUID change doesn't really change the table identifier resolution rule
> though. It is more of a safety protection.
>
> On Wed, May 7, 2025 at 10:02 PM Walaa Eldin Moustafa <
> wa.moust...@gmail.com> wrote:
>
>> Hi Steven,
>>
>> Thanks for the reply.
>>
>> > I agree with Dan that we shouldn't solve catalog naming in the Iceberg
>> view spec.
>>
>> To clarify, I don't believe the proposal is trying to solve catalog
>> naming. What it’s doing is simply this:
>>
>> * Proposing that table names inside views resolve the same way as they do
>> elsewhere (e.g., queries).
>> * Adopting a model that is already widely used and supported in the
>> existing ecosystem, which allows for:
>>     -- Renaming catalog aliases
>>     -- Swapping catalog implementations behind consistent names
>>     -- Having different default catalog names across engines that still
>> point to the same underlying tables
>>
>> These are common patterns in production data lakes. Saying Iceberg views
>> cannot operate in those environments feels unrealistic. In practice, it
>> means the spec breaks down in situations that users encounter regularly.
>>
>> > The recommendation of using engines’ current catalog and database can
>> cause context-dependent resolution results.
>>
>> * As noted in the doc and earlier replies, fixing a catalog name doesn’t
>> actually guarantee determinism either. All the failure scenarios above
>> still apply even when a default-catalog is stored.
>> * The current spec also allows default-catalog to be null, in which case
>> it falls back to the view’s catalog, yet that catalog is determined based
>> on how the view is referenced in the query, which would be considered
>> non-deterministic based on the same criteria you shared.
>> * The only true form of determinism here is UUID-based validation, which
>> protects against silent drift in any resolution model.
>>
>> If the current model is considered deterministic, do you think
>> `default-catalog` and `default-namespace` fields provide enough determinism
>> to eliminate the need for UUIDs when storing table identifiers?
>> Or put another way: Would you be comfortable relying solely on
>> default-catalog + default-namespace + table name to re-identify the correct
>> table, without UUID validation?
>>
>> +1 on involving other communities. I’m happy to help facilitate a
>> cross-community discussion if we aren’t able to reach a resolution here.
>>
>> Thanks,
>> Walaa.
>>
>>
>>
>> On Wed, May 7, 2025 at 9:20 PM Steven Wu <stevenz...@gmail.com> wrote:
>>
>>> I agree with Dan that we shouldn't solve catalog naming in the Iceberg
>>> view spec. I am not convinced that the proposed change will make the table
>>> identifier resolution more clear and portable. The recommendation of using
>>> engines' current catalog and database can cause context dependent
>>> resolution results, which seems non-deterministic to me.
>>>
>>> Walaa, you raised a point in the doc that the current catalog resolution
>>> logic (default-catalog field, then view catalog) is challenging and
>>> unrealistic for engines (like Spark and Trino). It would be great to get
>>> more input from the broader community on this part.
>>>
>>>
>>> On Tue, May 6, 2025 at 9:21 AM Benny Chow <btc...@gmail.com> wrote:
>>>
>>>> In Spark, I believe that the USE command sets the current catalog and
>>>> namespace.  This affects both where the view is created and how unqualified
>>>> table identifiers are resolved.  I also don't see an issue with saving the
>>>> current catalog and namespace into the view metadata's default-catalog and
>>>> default-namespace fields.
>>>>
>>>> On Wed, Apr 30, 2025 at 5:12 PM Walaa Eldin Moustafa <
>>>> wa.moust...@gmail.com> wrote:
>>>>
>>>>> > I think that's the lesser evil compared to Iceberg specifying how
>>>>> engines should resolve identifiers
>>>>>
>>>>> I think this is also similar to the previous point. It is the other
>>>>> way around. Right now the spec dictates how to resolve (through employing
>>>>> a view-specific `default-catalog` field). The proposal is suggesting to get
>>>>> out of this space and let engines handle it similar to how they handle all
>>>>> identifiers.
>>>>>
>>>>> On Wed, Apr 30, 2025 at 5:07 PM Walaa Eldin Moustafa <
>>>>> wa.moust...@gmail.com> wrote:
>>>>>
>>>>>> > I thought "default-catalog" could be set via the USE command.
>>>>>>
>>>>>> Benny, I think this is a misconception or miscommunication. The USE
>>>>>> command has no impact on the `default-catalog` field. In fact, the
>>>>>> proposal's direction is exactly to establish that the USE command should
>>>>>> influence how tables are resolved, the same as everywhere else. Right now
>>>>>> it is not the case under the current spec.
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 30, 2025 at 3:17 PM Benny Chow <btc...@gmail.com> wrote:
>>>>>>
>>>>>>> > there is no SQL construct today to explicitly set default-catalog
>>>>>>>
>>>>>>> I thought "default-catalog" could be set via the USE command.
>>>>>>>
>>>>>>> I generally agree with Dan about requiring consistent catalog
>>>>>>> names.  I think that's the lesser evil compared to Iceberg specifying
>>>>>>> how engines should resolve identifiers.  Another thing to consider is that
>>>>>>> identifier resolution can be very expensive at query validation time if
>>>>>>> identifiers need to be looked up from a bunch of places.  Hopefully, it
>>>>>>> should be possible to define a view in such a way that identifiers can
>>>>>>> be resolved on the first try.
>>>>>>>
>>>>>>> Benny
>>>>>>>
>>>>>>> On Tue, Apr 29, 2025 at 10:29 PM Walaa Eldin Moustafa <
>>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Rishabh,
>>>>>>>>
>>>>>>>> You're right that the proposal touches on two aspects, and
>>>>>>>> resolution rules are one of them. The other aspect is the proposal's
>>>>>>>> position that table identifiers should be stored in metadata exactly as
>>>>>>>> they appear in the view text (e.g., even if they're two-part or
>>>>>>>> partially qualified), along with their corresponding UUIDs for validation.
>>>>>>>> This applies both to referenced input tables and the storage table
>>>>>>>> identifier in materialized views.
>>>>>>>>
>>>>>>>> We may be able to converge on this storage format even if we
>>>>>>>> haven't yet converged on the resolution fallback rules. I believe both
>>>>>>>> resolution strategies currently being discussed would still lead to
>>>>>>>> storing identifiers in this way.
>>>>>>>>
>>>>>>>> I'm supportive of moving forward with consensus on the identifier
>>>>>>>> storage format. That said, we may continue to run into questions
>>>>>>>> related to resolution during implementation. For example: Should the
>>>>>>>> storage table identifier follow the same default-catalog and
>>>>>>>> default-namespace resolution behavior as other table references?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Walaa.
>>>>>>>>
>>>>>>>> On Tue, Apr 29, 2025 at 10:07 PM Rishabh Bhatia <
>>>>>>>> bhatiarishab...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello Walaa,
>>>>>>>>>
>>>>>>>>> Thanks for starting this discussion.
>>>>>>>>>
>>>>>>>>> I think we should decouple at least the MV Spec from the proposal
>>>>>>>>> to change the current behavior of view resolution.
>>>>>>>>>
>>>>>>>>> We can continue discussing whether the current view spec needs to
>>>>>>>>> be changed. Based on that decision, we can update the view resolution
>>>>>>>>> rule at a later point if required.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Rishabh
>>>>>>>>>
>>>>>>>>> On Mon, Apr 28, 2025 at 3:22 PM Walaa Eldin Moustafa <
>>>>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Correction of typo: both engines seem to set default-catalog to
>>>>>>>>>> the view catalog if it is defined, or to null if the view catalog is
>>>>>>>>>> not defined.
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 28, 2025 at 3:06 PM Walaa Eldin Moustafa <
>>>>>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>
>>>>>>>>>>> Thanks again for your response.
>>>>>>>>>>>
>>>>>>>>>>> I agree that catalog renaming is an environmental event, but
>>>>>>>>>>> it's a real one that happens frequently in practice.
>>>>>>>>>>> Saying that the Iceberg spec cannot accommodate something as
>>>>>>>>>>> common as catalog renaming feels very restrictive, and could make
>>>>>>>>>>> the spec less practical, even unusable, for real-world deployments.
>>>>>>>>>>> I’m sharing this from the perspective of a large data lake
>>>>>>>>>>> environment where views are heavily deployed and operationalized.
>>>>>>>>>>>
>>>>>>>>>>> Further, it's worth noting that the table spec is resilient to
>>>>>>>>>>> catalog renaming, but the view spec is not. If we have an
>>>>>>>>>>> opportunity to make the view spec similarly resilient, I wonder why not?
>>>>>>>>>>> Both specifications are deterministic in their definition, but
>>>>>>>>>>> one is more fragile to environmental changes than the other.
>>>>>>>>>>> Improving resilience does not sacrifice determinism. It simply makes
>>>>>>>>>>> views safer and more portable over time.
>>>>>>>>>>>
>>>>>>>>>>> Separately, given that there is no SQL construct today to
>>>>>>>>>>> explicitly set default-catalog at creation time, what is the
>>>>>>>>>>> intuition behind how engines like Spark and Trino currently assign
>>>>>>>>>>> default-catalog?
>>>>>>>>>>> Today, both engines seem to set default-catalog to null if the
>>>>>>>>>>> view catalog is defined, or to the view catalog if not.
>>>>>>>>>>> What was the intended thought process behind this behavior?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Walaa
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Apr 28, 2025 at 1:33 PM Daniel Weeks <dwe...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Walaa,
>>>>>>>>>>>>
>>>>>>>>>>>> > tables inside views remain reachable after a catalog rename
>>>>>>>>>>>>
>>>>>>>>>>>> This problem stems from the exact environmental/configuration
>>>>>>>>>>>> issue that we should not be trying to address.  I don't think we
>>>>>>>>>>>> would expect references to survive a catalog rename.  That's not
>>>>>>>>>>>> something covered by the spec and needs to be handled separately as a
>>>>>>>>>>>> platform-level migration specific to the affected environment.
>>>>>>>>>>>>
>>>>>>>>>>>> The identifier resolution logic is clear and deterministic.  It
>>>>>>>>>>>> should not matter whether an engine resolves and encodes the
>>>>>>>>>>>> default-catalog or leaves it to the resolution rules.
>>>>>>>>>>>>
>>>>>>>>>>>> The issue isn't with how the spec is defined, but rather view
>>>>>>>>>>>> behavior when you start altering the environment around it, which
>>>>>>>>>>>> isn't something we should be trying to define here.
>>>>>>>>>>>>
>>>>>>>>>>>> -Dan
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Apr 28, 2025 at 12:17 PM Walaa Eldin Moustafa <
>>>>>>>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Dan,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for chiming in.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I believe the issues we’re seeing now go beyond just catalog
>>>>>>>>>>>>> naming consistency. The behavior around default-catalog itself
>>>>>>>>>>>>> introduces resolution inconsistencies even when catalog names are
>>>>>>>>>>>>> consistent.
>>>>>>>>>>>>> For example:
>>>>>>>>>>>>>
>>>>>>>>>>>>> * When default-catalog is set to null, tables inside views
>>>>>>>>>>>>> remain reachable after a catalog rename. But if it is set to a
>>>>>>>>>>>>> non-null value, table references will break.
>>>>>>>>>>>>>
>>>>>>>>>>>>> * default-catalog causes table references inside views to be
>>>>>>>>>>>>> early bound (i.e., bound at view creation time, especially when
>>>>>>>>>>>>> using a non-null value), while table references inside standalone
>>>>>>>>>>>>> queries are late bound (bound at query time). This creates
>>>>>>>>>>>>> inconsistencies when resolving the same table name inside and outside
>>>>>>>>>>>>> views, even within the same job.
>>>>>>>>>>>>>
>>>>>>>>>>>>> * It causes Spark's and Trino's behavior to drift from the spec.
>>>>>>>>>>>>> There is no way to fully align Spark's behavior without making
>>>>>>>>>>>>> invasive changes to the Spark SQL grammar and the View DataSource API
>>>>>>>>>>>>> (specifically on the CREATE side). This challenge would extend to other
>>>>>>>>>>>>> engines too. Both Spark and Trino set this field based on a heuristic in
>>>>>>>>>>>>> today's implementation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> * With view nesting (views depending on views), these
>>>>>>>>>>>>> inconsistencies amplify further, forcing users and engines to
>>>>>>>>>>>>> reason about catalog resolution at every level in the view tree.
>>>>>>>>>>>>>
>>>>>>>>>>>>> * It will be difficult to migrate Hive views to Iceberg with
>>>>>>>>>>>>> that model. Migrated Hive views will have to deviate from that spec.
>>>>>>>>>>>>>
>>>>>>>>>>>>> How would you suggest approaching the engine-level changes
>>>>>>>>>>>>> required to support the current default-catalog field?
>>>>>>>>>>>>> Also, do you believe the Spark and Trino communities would
>>>>>>>>>>>>> align around having table resolution behave inconsistently
>>>>>>>>>>>>> between queries and views, or inconsistency between Iceberg and other
>>>>>>>>>>>>> types of views?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Walaa
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Mon, Apr 28, 2025 at 11:34 AM Daniel Weeks <
>>>>>>>>>>>>> dwe...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would agree with Jan's summary of why 'default-catalog' was
>>>>>>>>>>>>>> introduced, but I think we need to step back and align on what
>>>>>>>>>>>>>> we are really attempting to support in the spec.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The issues we're discussing largely stem from using multiple
>>>>>>>>>>>>>> engines with cross catalog references and configurations where
>>>>>>>>>>>>>> catalog names are not aligned.  If we have multiple engines that all
>>>>>>>>>>>>>> have the same catalog names/configurations, the current spec
>>>>>>>>>>>>>> implementation is well defined for table resolution even across
>>>>>>>>>>>>>> catalogs.  The 'default-catalog' (and namespace equivalent) was
>>>>>>>>>>>>>> intended to address the resolution within the context of the sql text,
>>>>>>>>>>>>>> not to address catalog/naming inconsistencies.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I feel like we're trying to adapt the original intent to
>>>>>>>>>>>>>> address the catalog naming/configuration and would argue that we
>>>>>>>>>>>>>> shouldn't attempt to do that as part of the spec.  Inconsistently named
>>>>>>>>>>>>>> catalogs are a reality, but we should consider that a
>>>>>>>>>>>>>> configuration/environmental issue, not something to solve for in the
>>>>>>>>>>>>>> spec.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> We should support and advocate for consistency in catalog
>>>>>>>>>>>>>> naming and define the spec along those lines.  The fact is that
>>>>>>>>>>>>>> with all of the recent work that's gone into making catalogs pluggable,
>>>>>>>>>>>>>> it makes more sense to just register catalog configuration with
>>>>>>>>>>>>>> consistent names (even if you have to duplicate the configuration for
>>>>>>>>>>>>>> supporting existing readers/writers).  I think it's better to provide a
>>>>>>>>>>>>>> path toward consistency than to normalize complicated schemes to work
>>>>>>>>>>>>>> around the issues caused by environmental/configuration inconsistencies.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If the goal is to create clever ways to hack the late binding
>>>>>>>>>>>>>> resolution to swap in different catalogs or make references
>>>>>>>>>>>>>> contextual, I feel like that is something we should strongly discourage
>>>>>>>>>>>>>> as it leads to confusion about what is resolved as part of the query.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> At this point, I don't see a good argument to add
>>>>>>>>>>>>>> additional configuration or change the resolution behaviors.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -Dan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Mon, Apr 28, 2025 at 12:40 AM Jan Kaul
>>>>>>>>>>>>>> <jank...@mailbox.org.invalid> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think the intention with the "default-catalog" was that
>>>>>>>>>>>>>>> every query engine uses it to store its session default catalog
>>>>>>>>>>>>>>> at the time of creating the view. This way the view could be reused in
>>>>>>>>>>>>>>> another session. The idea was not to introduce an additional SQL syntax
>>>>>>>>>>>>>>> to set the default-catalog.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Generally we have different environments we want to support
>>>>>>>>>>>>>>> with the view spec:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Consistent catalog naming
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When the environment supports it, using consistent catalog
>>>>>>>>>>>>>>> names can have a great benefit for multi-catalog, multi-engine
>>>>>>>>>>>>>>> setups. With consistent catalog names, using the "default-catalog"
>>>>>>>>>>>>>>> field works without any issues.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. Inconsistent catalog naming
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This can be the case when different query engines refer to
>>>>>>>>>>>>>>> the same physical catalog by different names. This often
>>>>>>>>>>>>>>> happens because different query engines use different strategies to set
>>>>>>>>>>>>>>> up the catalogs. If catalogs have inconsistent naming, using the
>>>>>>>>>>>>>>> "default-catalog" field does not work because it is not guaranteed that
>>>>>>>>>>>>>>> the catalog name can be resolved with another engine. Using the
>>>>>>>>>>>>>>> "view catalog" as a fallback is a better solution for this use case, as
>>>>>>>>>>>>>>> it avoids catalog names altogether. It is however limited to table
>>>>>>>>>>>>>>> references in the same catalog.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What do you think of introducing a view property that
>>>>>>>>>>>>>>> specifies if the "default-catalog" or the "view catalog" should
>>>>>>>>>>>>>>> be used? This way, you could use the "default-catalog" in environments
>>>>>>>>>>>>>>> where you can guarantee consistent naming, but you would be able to
>>>>>>>>>>>>>>> directly fall back to the "view catalog" when you don't have consistent
>>>>>>>>>>>>>>> naming. The query engines could set the default for this view property
>>>>>>>>>>>>>>> at creation time. Spark for example could set it to automatically use
>>>>>>>>>>>>>>> the "view catalog".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Jan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 4/26/25 05:33, Walaa Eldin Moustafa wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To help folks catch up on the latest discussions and
>>>>>>>>>>>>>>> interpretation of the spec, I have summarized everything we
>>>>>>>>>>>>>>> discussed so far at the top of the proposal document (here
>>>>>>>>>>>>>>> <https://docs.google.com/document/d/1-I2v_OqBgJi_8HVaeH1u2jowghmXoB8XaJLzPBa_Hg8/edit?tab=t.0>).
>>>>>>>>>>>>>>> I have slightly updated the proposal to be in sync with the new
>>>>>>>>>>>>>>> interpretation to avoid confusion. In summary:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * Remove default-catalog and default-namespace fields from
>>>>>>>>>>>>>>> the view spec completely.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * Hence, we do not attempt to define separate view-level
>>>>>>>>>>>>>>> default catalogs or namespaces.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Instead:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * If a table identifier inside a view lacks a catalog
>>>>>>>>>>>>>>> qualifier, engines should resolve it using the current engine
>>>>>>>>>>>>>>> catalog at query time.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * Reference table identifiers in the metadata exactly as
>>>>>>>>>>>>>>> they appear in the view SQL text.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * If an identifier lacks the catalog part at creation, it
>>>>>>>>>>>>>>> should still lack a catalog in the stored metadata.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> * Store UUIDs alongside table identifiers whenever possible.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Walaa.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Apr 25, 2025 at 5:18 PM Walaa Eldin Moustafa <
>>>>>>>>>>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for the contribution Benny! +1 to the confusion the
>>>>>>>>>>>>>>>> fallback creates. Also just to be clear, at this point and
>>>>>>>>>>>>>>>> after clarifying the current spec intentions, I am convinced that we
>>>>>>>>>>>>>>>> should remove the default catalog and default namespace fields
>>>>>>>>>>>>>>>> altogether.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>> Walaa.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Fri, Apr 25, 2025 at 5:13 PM Benny Chow <
>>>>>>>>>>>>>>>> btc...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'd like to contribute my opinions on this:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - I don't particularly like the current behavior of
>>>>>>>>>>>>>>>>> "default to the view's catalog when default-catalog is not
>>>>>>>>>>>>>>>>> set". Fundamentally, I believe the intent of default-catalog and
>>>>>>>>>>>>>>>>> default-namespace is there to help users write more concise
>>>>>>>>>>>>>>>>> SQL.
>>>>>>>>>>>>>>>>> - spark session catalog is engine specific and I don't
>>>>>>>>>>>>>>>>> think we should design something that says first use this
>>>>>>>>>>>>>>>>> catalog, then that catalog.. or that catalog.  For example, resolving
>>>>>>>>>>>>>>>>> identifiers using default-catalog -> view's catalog -> session catalog
>>>>>>>>>>>>>>>>> is not good.
>>>>>>>>>>>>>>>>> - We gotta support non-Iceberg tables otherwise I see no
>>>>>>>>>>>>>>>>> value in putting views in the catalog to share with other
>>>>>>>>>>>>>>>>> engines
>>>>>>>>>>>>>>>>> - Interoperability between different engine types is very
>>>>>>>>>>>>>>>>> hard due to dialect issues... so I think we should focus on
>>>>>>>>>>>>>>>>> supporting different clusters of the same engine type on a shared
>>>>>>>>>>>>>>>>> catalog.  For example, AI and BI clusters on Spark sharing the same
>>>>>>>>>>>>>>>>> views in a REST catalog.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Coincidentally, I think the ultimate solution is along the
>>>>>>>>>>>>>>>>> lines of something Russell proposed last year:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://lists.apache.org/thread/hoskfx8y3kvrcww52l4w9dxghp3pnlm7
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> We've been looking at this interoperable identifier
>>>>>>>>>>>>>>>>> problem through the lens of catalog resolution but maybe the
>>>>>>>>>>>>>>>>> right approach is really about templating.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I would extend Russell's idea to allow identifiers in a
>>>>>>>>>>>>>>>>> view to span catalogs to support non-Iceberg tables.   Also,
>>>>>>>>>>>>>>>>> the default-catalog property could be templated as well.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thoughts?
>>>>>>>>>>>>>>>>> Benny
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Fri, Apr 25, 2025 at 4:02 PM Walaa Eldin Moustafa <
>>>>>>>>>>>>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks Steven! How do you recommend making Spark
>>>>>>>>>>>>>>>>>> implementation conform to the spec? Do we need Spark SQL
>>>>>>>>>>>>>>>>>> extensions and/or Spark catalog APIs for that?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> How do you recommend reconciling the inconsistencies I
>>>>>>>>>>>>>>>>>> shared regarding many resolution methods not consistently
>>>>>>>>>>>>>>>>>> being followed in different scenarios (view vs child table resolution,
>>>>>>>>>>>>>>>>>> query vs view resolution)? Note these occur when the default catalog is
>>>>>>>>>>>>>>>>>> set to a non-null value. If it helps, I can share concrete examples.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>> Walaa.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Fri, Apr 25, 2025 at 3:52 PM Steven Wu <
>>>>>>>>>>>>>>>>>> stevenz...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The core issue is the fallback behavior when
>>>>>>>>>>>>>>>>>>> `default-catalog` is
>>>>>>>>>>>>>>>>>>> not defined. Current view spec says the fallback should
>>>>>>>>>>>>>>>>>>> be the catalog
>>>>>>>>>>>>>>>>>>> where the view is defined. It doesn't really matter what
>>>>>>>>>>>>>>>>>>> the catalog
>>>>>>>>>>>>>>>>>>> is named (catalogX) by the read engine.
>>>>>>>>>>>>>>>>>>> - If a view refers to the tables in the same catalog,
>>>>>>>>>>>>>>>>>>> this is a
>>>>>>>>>>>>>>>>>>> non-ambiguous and reasonable fallback behavior.
>>>>>>>>>>>>>>>>>>> - If a view refers to tables from another catalog,
>>>>>>>>>>>>>>>>>>> catalog names
>>>>>>>>>>>>>>>>>>> should be included in the reference name already. So no
>>>>>>>>>>>>>>>>>>> ambiguity
>>>>>>>>>>>>>>>>>>> there either.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Potential inconsistent naming of catalog is a separate
>>>>>>>>>>>>>>>>>>> problem, which
>>>>>>>>>>>>>>>>>>> Iceberg view spec probably cannot solve. We can only
>>>>>>>>>>>>>>>>>>> recommend that
>>>>>>>>>>>>>>>>>>> catalog should be named consistently across usage for
>>>>>>>>>>>>>>>>>>> better
>>>>>>>>>>>>>>>>>>> interoperability on name references.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> This proposal is to change the fallback behavior to
>>>>>>>>>>>>>>>>>>> engine's session
>>>>>>>>>>>>>>>>>>> default catalog. I am not sure it is better than the
>>>>>>>>>>>>>>>>>>> current fallback
>>>>>>>>>>>>>>>>>>> behavior.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> > Today’s Spark behavior explicitly differs from this
>>>>>>>>>>>>>>>>>>> idea. Spark resolves table identifiers during view creation
>>>>>>>>>>>>>>>>>>> using the session’s default catalog, not a supplied `default-catalog`.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I would argue that is a Spark implementation issue for
>>>>>>>>>>>>>>>>>>> not conforming
>>>>>>>>>>>>>>>>>>> to the spec.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Fri, Apr 25, 2025 at 1:17 PM Walaa Eldin Moustafa
>>>>>>>>>>>>>>>>>>> <wa.moust...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Hi Jan,
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Thanks again for continuing the discussion. I want to
>>>>>>>>>>>>>>>>>>> highlight a few fundamental issues around the
>>>>>>>>>>>>>>>>>>> interpretation of default-catalog:
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Here is the real catch:
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > * default-catalog cannot logically be defined at view
>>>>>>>>>>>>>>>>>>> creation time. It would be circular: the view needs to
>>>>>>>>>>>>>>>>>>> exist before its metadata (and hence default-catalog) can exist. This is
>>>>>>>>>>>>>>>>>>> visible in Spark’s implementation, where `default-catalog` is not used.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > * Introducing a creation-time default-catalog setting
>>>>>>>>>>>>>>>>>>> would require extending SQL syntax and engine APIs to
>>>>>>>>>>>>>>>>>>> promote it to a first-class view concept. This would be intrusive,
>>>>>>>>>>>>>>>>>>> non-intuitive, and realistically very difficult to standardize across
>>>>>>>>>>>>>>>>>>> engines.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > * Today’s Spark behavior explicitly differs from this
>>>>>>>>>>>>>>>>>>> idea. Spark resolves table identifiers during view creation
>>>>>>>>>>>>>>>>>>> using the session’s default catalog, not a supplied `default-catalog`.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > * Hypothetically even if we patched in a creation-time
>>>>>>>>>>>>>>>>>>> default-catalog, it would create an inconsistent binding
>>>>>>>>>>>>>>>>>>> model between tables vs views (early vs late), and between tables in
>>>>>>>>>>>>>>>>>>> views and in queries (again early vs late). For example, views and
>>>>>>>>>>>>>>>>>>> tables in queries can withstand default catalog renames, but tables
>>>>>>>>>>>>>>>>>>> cannot when they are used inside views -- it even applies to views
>>>>>>>>>>>>>>>>>>> inside views, which makes this very hard to reason about considering
>>>>>>>>>>>>>>>>>>> nesting.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > Thanks,
>>>>>>>>>>>>>>>>>>> > Walaa
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > On Fri, Apr 25, 2025 at 7:00 AM Jan Kaul
>>>>>>>>>>>>>>>>>>> <jank...@mailbox.org.invalid> wrote:
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> @Walaa:
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> I would argue that when you run a CREATE VIEW
>>>>>>>>>>>>>>>>>>> statement the query engine knows which catalog the view is
>>>>>>>>>>>>>>>>>>> being created in. So even though we typically use late binding to
>>>>>>>>>>>>>>>>>>> resolve the view catalog at query time, it can also be used at creation
>>>>>>>>>>>>>>>>>>> time.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> The query engine would need to keep track of the
>>>>>>>>>>>>>>>>>>> "view catalog" where the view is going to be created. It
>>>>>>>>>>>>>>>>>>> can use that catalog to resolve partial table identifiers if
>>>>>>>>>>>>>>>>>>> "default-catalog" is not set.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> It can lead to some unintuitive behavior, where
>>>>>>>>>>>>>>>>>>> partial identifiers in the view query resolve to a
>>>>>>>>>>>>>>>>>>> different catalog compared to using them outside of a view.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> CREATE VIEW catalogA.sales.monthly_orders AS SELECT *
>>>>>>>>>>>>>>>>>>> from sales.orders;
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> If the session default catalog is not "catalogA", the
>>>>>>>>>>>>>>>>>>> "sales.orders" in the view query would not be the same as
>>>>>>>>>>>>>>>>>>> just referencing "sales.orders" in a normal SQL statement. This is
>>>>>>>>>>>>>>>>>>> because without a "default-catalog", the catalog name of "sales.orders"
>>>>>>>>>>>>>>>>>>> would default to "catalogA", which is the view's catalog.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Thanks,
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Jan
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> On 4/25/25 04:05, Manu Zhang wrote:
>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>> >>> For example, if we want to validate that the tables
>>>>>>>>>>>>>>>>>>> referenced in the view exist, how can we do that when 
>>>>>>>>>>>>>>>>>>> default-catalog isn't
>>>>>>>>>>>>>>>>>>> defined, since the view hasn't been created or loaded yet?
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> I don't think this is related to view spec. How do we
>>>>>>>>>>>>>>>>>>> validate that a table exists without a default catalog, or 
>>>>>>>>>>>>>>>>>>> do we always use
>>>>>>>>>>>>>>>>>>> the current session catalog?
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Thanks,
>>>>>>>>>>>>>>>>>>> >> Manu
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> On Fri, Apr 25, 2025 at 5:59 AM Walaa Eldin Moustafa <
>>>>>>>>>>>>>>>>>>> wa.moust...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>> >>> Hi Jan,
>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>> >>> I think we still share the same understanding. Just
>>>>>>>>>>>>>>>>>>> to clarify: when I referred to late binding as “similar” to 
>>>>>>>>>>>>>>>>>>> the proposal, I
>>>>>>>>>>>>>>>>>>> was acknowledging the distinction between view-level and 
>>>>>>>>>>>>>>>>>>> table-level
>>>>>>>>>>>>>>>>>>> resolution. But as you noted, both follow a late binding 
>>>>>>>>>>>>>>>>>>> model.
>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>> >>> That said, this still raises an interesting question
>>>>>>>>>>>>>>>>>>> and a potential gap: if default-catalog is only defined at 
>>>>>>>>>>>>>>>>>>> query time, how
>>>>>>>>>>>>>>>>>>> should resolution work during view creation? For example, 
>>>>>>>>>>>>>>>>>>> if we want to
>>>>>>>>>>>>>>>>>>> validate that the tables referenced in the view exist, how 
>>>>>>>>>>>>>>>>>>> can we do that
>>>>>>>>>>>>>>>>>>> when default-catalog isn't defined, since the view hasn't 
>>>>>>>>>>>>>>>>>>> been created or
>>>>>>>>>>>>>>>>>>> loaded yet?
>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>> >>> Thanks,
>>>>>>>>>>>>>>>>>>> >>> Walaa.
>>>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>>>> >>> On Thu, Apr 24, 2025 at 7:02 AM Jan Kaul
>>>>>>>>>>>>>>>>>>> <jank...@mailbox.org.invalid> wrote:
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> Yes, I have the same understanding. The view
>>>>>>>>>>>>>>>>>>> catalog is resolved at query time.
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> As you mentioned before, it's good to distinguish
>>>>>>>>>>>>>>>>>>> between the physical catalog and its reference used in SQL 
>>>>>>>>>>>>>>>>>>> statements. The
>>>>>>>>>>>>>>>>>>> important part is that the physical catalog of the view and 
>>>>>>>>>>>>>>>>>>> the tables
>>>>>>>>>>>>>>>>>>> referenced in its definition stay consistent. You could 
>>>>>>>>>>>>>>>>>>> create a view in a
>>>>>>>>>>>>>>>>>>> given physical catalog by referring to it as "catalogA", as 
>>>>>>>>>>>>>>>>>>> in your first
>>>>>>>>>>>>>>>>>>> point. If you then, given a different setup, refer to the 
>>>>>>>>>>>>>>>>>>> same physical
>>>>>>>>>>>>>>>>>>> catalog as "catalogB" in another session/environment, the 
>>>>>>>>>>>>>>>>>>> behavior should
>>>>>>>>>>>>>>>>>>> still work.
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> I would however rephrase your last point. Late
>>>>>>>>>>>>>>>>>>> binding applies to the view catalog name and by extension 
>>>>>>>>>>>>>>>>>>> to all partial
>>>>>>>>>>>>>>>>>>> table references when no "default-catalog" is present. 
>>>>>>>>>>>>>>>>>>> Resolving the view
>>>>>>>>>>>>>>>>>>> catalog name at query time is not opposed to storing the 
>>>>>>>>>>>>>>>>>>> view metadata in a
>>>>>>>>>>>>>>>>>>> catalog.
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> Or maybe I don't entirely understand what you mean.
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> Thanks
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> Jan
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> On 4/24/25 00:32, Walaa Eldin Moustafa wrote:
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> Hi Jan,
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> > The view is executed when it's being referenced
>>>>>>>>>>>>>>>>>>> in a SQL statement. That statement contains the information 
>>>>>>>>>>>>>>>>>>> for the query
>>>>>>>>>>>>>>>>>>> engine to resolve the catalog of the view.
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> If I’m understanding correctly, that means:
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> * If the view is queried as SELECT * FROM
>>>>>>>>>>>>>>>>>>> catalogA.namespace.view, then catalogA is considered the 
>>>>>>>>>>>>>>>>>>> view’s catalog.
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> * If the same view is later queried as SELECT *
>>>>>>>>>>>>>>>>>>> FROM catalogB.namespace.view (after renaming catalogA to 
>>>>>>>>>>>>>>>>>>> catalogB, and
>>>>>>>>>>>>>>>>>>> keeping everything else the same), then catalogB becomes 
>>>>>>>>>>>>>>>>>>> the view’s catalog.
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> Is that interpretation correct? If so, it sounds to
>>>>>>>>>>>>>>>>>>> me like the catalog is resolved at query time, based on how 
>>>>>>>>>>>>>>>>>>> the view is
>>>>>>>>>>>>>>>>>>> referenced, not from any stored metadata. That would imply 
>>>>>>>>>>>>>>>>>>> some sort of a
>>>>>>>>>>>>>>>>>>> late binding behavior (similar to the proposal), as opposed 
>>>>>>>>>>>>>>>>>>> to using some
>>>>>>>>>>>>>>>>>>> catalog that "stores" the view definition.
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> Thanks,
>>>>>>>>>>>>>>>>>>> >>>> Walaa
>>>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>>>> >>>> On Tue, Apr 22, 2025 at 11:01 AM Jan Kaul
>>>>>>>>>>>>>>>>>>> <jank...@mailbox.org.invalid> wrote:
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> Hi Walaa,
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> Thanks for clarifying the aspects of
>>>>>>>>>>>>>>>>>>> non-determinism. Let me try to address your questions.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> 1. This is my interpretation of the current spec:
>>>>>>>>>>>>>>>>>>> The view is executed when it's being referenced in a SQL 
>>>>>>>>>>>>>>>>>>> statement. That
>>>>>>>>>>>>>>>>>>> statement contains the information for the query engine to 
>>>>>>>>>>>>>>>>>>> resolve the
>>>>>>>>>>>>>>>>>>> catalog of the view. The query engine then uses that 
>>>>>>>>>>>>>>>>>>> information to fetch
>>>>>>>>>>>>>>>>>>> the view metadata from the catalog. It also needs to 
>>>>>>>>>>>>>>>>>>> temporarily keep track
>>>>>>>>>>>>>>>>>>> of which catalog it used to fetch the view metadata. It can 
>>>>>>>>>>>>>>>>>>> then use that
>>>>>>>>>>>>>>>>>>> information to resolve the table references in the view's 
>>>>>>>>>>>>>>>>>>> SQL definition in
>>>>>>>>>>>>>>>>>>> case no default catalog is specified.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> 2. The important part is that the catalog can be
>>>>>>>>>>>>>>>>>>> referenced at execution time. As long as that's the case I 
>>>>>>>>>>>>>>>>>>> would assume the
>>>>>>>>>>>>>>>>>>> view can be created in any catalog.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> I think your point is really valuable because the
>>>>>>>>>>>>>>>>>>> current specification can lead to some unintuitive 
>>>>>>>>>>>>>>>>>>> behavior. For example
>>>>>>>>>>>>>>>>>>> for the following statement:
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> CREATE VIEW catalogA.sales.monthly_orders AS
>>>>>>>>>>>>>>>>>>> SELECT * from sales.orders;
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> If the session default catalog is not "catalogA",
>>>>>>>>>>>>>>>>>>> the "sales.orders" in the view query would not be the same 
>>>>>>>>>>>>>>>>>>> as just
>>>>>>>>>>>>>>>>>>> referencing "sales.orders" in a normal SQL statement. This 
>>>>>>>>>>>>>>>>>>> is because
>>>>>>>>>>>>>>>>>>> without a "default-catalog", the catalog name of 
>>>>>>>>>>>>>>>>>>> "sales.orders" would
>>>>>>>>>>>>>>>>>>> default to "catalogA".
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> However, I like the current design of the view
>>>>>>>>>>>>>>>>>>> spec, because it has the "closure" property. Because of the 
>>>>>>>>>>>>>>>>>>> fact that the
>>>>>>>>>>>>>>>>>>> "view catalog" has to be known when executing a view, all 
>>>>>>>>>>>>>>>>>>> the information
>>>>>>>>>>>>>>>>>>> required to resolve the table identifiers is contained in 
>>>>>>>>>>>>>>>>>>> the view metadata
>>>>>>>>>>>>>>>>>>> (and the "view catalog"). I think that if you make the 
>>>>>>>>>>>>>>>>>>> identifier
>>>>>>>>>>>>>>>>>>> resolution dependent on external parameters, it hinders 
>>>>>>>>>>>>>>>>>>> portability.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> Thanks,
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> Jan
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> On 4/22/25 18:36, Walaa Eldin Moustafa wrote:
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> Hi Jan,
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> Thanks for the thoughtful feedback.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> I think it’s important we clarify a key point
>>>>>>>>>>>>>>>>>>> before going deeper:
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> Non-determinism is not caused by session fallback
>>>>>>>>>>>>>>>>>>> behavior—it’s a fundamental limitation of using table 
>>>>>>>>>>>>>>>>>>> identifiers alone,
>>>>>>>>>>>>>>>>>>> regardless of whether we use the current rule, the proposed 
>>>>>>>>>>>>>>>>>>> fallback to the
>>>>>>>>>>>>>>>>>>> session’s default catalog, or even early vs. late binding.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> The same fully qualified identifier (e.g.,
>>>>>>>>>>>>>>>>>>> catalogA.namespace.table) can resolve to different objects 
>>>>>>>>>>>>>>>>>>> depending solely
>>>>>>>>>>>>>>>>>>> on engine-specific routing logic or catalog aliases. So 
>>>>>>>>>>>>>>>>>>> determinism isn’t
>>>>>>>>>>>>>>>>>>> guaranteed just because an identifier is "fully qualified." 
>>>>>>>>>>>>>>>>>>> The only
>>>>>>>>>>>>>>>>>>> reliable anchor for identity is the UUID. That’s why the 
>>>>>>>>>>>>>>>>>>> proposed use of
>>>>>>>>>>>>>>>>>>> UUIDs is not just a hardening strategy. It’s the actual fix 
>>>>>>>>>>>>>>>>>>> for correctness.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> To move the conversation forward, could you help
>>>>>>>>>>>>>>>>>>> clarify two things in the context of the current spec:
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> * Where in the metadata is the “view catalog”
>>>>>>>>>>>>>>>>>>> stored, so that an engine knows to fall back to it if 
>>>>>>>>>>>>>>>>>>> default-catalog is
>>>>>>>>>>>>>>>>>>> null?
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> * Are we even allowed to create views in the
>>>>>>>>>>>>>>>>>>> session's default catalog (i.e., without specifying a 
>>>>>>>>>>>>>>>>>>> catalog) in the
>>>>>>>>>>>>>>>>>>> current Iceberg spec?
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> These questions are important because if we can’t
>>>>>>>>>>>>>>>>>>> unambiguously recover the "view catalog" from metadata, 
>>>>>>>>>>>>>>>>>>> then defaulting to
>>>>>>>>>>>>>>>>>>> it is problematic. And if views can't be created in the 
>>>>>>>>>>>>>>>>>>> default catalog,
>>>>>>>>>>>>>>>>>>> then the fallback rule doesn’t generalize.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> Thanks,
>>>>>>>>>>>>>>>>>>> >>>>> Walaa.
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>>>> >>>>> On Tue, Apr 22, 2025 at 3:14 AM Jan Kaul
>>>>>>>>>>>>>>>>>>> <jank...@mailbox.org.invalid> wrote:
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Hi Walaa,
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> thank you for your proposal. If I understood
>>>>>>>>>>>>>>>>>>> correctly, your proposal is composed of three parts:
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> - session default catalog as fallback for
>>>>>>>>>>>>>>>>>>> "default-catalog"
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> - session default namespace as fallback for
>>>>>>>>>>>>>>>>>>> "default-namespace"
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> - Late binding + UUID validation
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> I have some comments regarding these points.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> 1. Session default catalog as fallback for
>>>>>>>>>>>>>>>>>>> "default-catalog"
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Introducing a behavior that depends on the
>>>>>>>>>>>>>>>>>>> current session setup is in my opinion the definition of 
>>>>>>>>>>>>>>>>>>> "non-determinism".
>>>>>>>>>>>>>>>>>>> You could be running the same query-engine and 
>>>>>>>>>>>>>>>>>>> catalog-setup on different
>>>>>>>>>>>>>>>>>>> days, with different default session catalogs (which is 
>>>>>>>>>>>>>>>>>>> rather common), and
>>>>>>>>>>>>>>>>>>> would be getting different results.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Whereas with the current behavior, the view
>>>>>>>>>>>>>>>>>>> always produces the same results. The current behavior has 
>>>>>>>>>>>>>>>>>>> some rough edges
>>>>>>>>>>>>>>>>>>> in very niche use cases, but I think it is solid for most use 
>>>>>>>>>>>>>>>>>>> cases.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> 2. Session default namespace as fallback for
>>>>>>>>>>>>>>>>>>> "default-namespace"
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Similar to the above.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> 3. Late binding + UUID validation
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> If I understand it correctly, the current
>>>>>>>>>>>>>>>>>>> implementation already uses late binding.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Generally, having UUID validation makes the setup
>>>>>>>>>>>>>>>>>>> more robust. Which is great. However, having UUID 
>>>>>>>>>>>>>>>>>>> validation still requires
>>>>>>>>>>>>>>>>>>> us to have a portable table identifier specification. Even 
>>>>>>>>>>>>>>>>>>> if we have the
>>>>>>>>>>>>>>>>>>> UUIDs of the referenced tables from the view, there simply 
>>>>>>>>>>>>>>>>>>> isn't an
>>>>>>>>>>>>>>>>>>> interface that lets us use those UUIDs. The catalog 
>>>>>>>>>>>>>>>>>>> interface is defined
>>>>>>>>>>>>>>>>>>> in terms of table identifiers.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> So we always require a working catalog setup and
>>>>>>>>>>>>>>>>>>> suitable table identifiers to obtain the table metadata. We 
>>>>>>>>>>>>>>>>>>> can use the
>>>>>>>>>>>>>>>>>>> UUIDs to verify if we loaded the correct table. But this 
>>>>>>>>>>>>>>>>>>> can only be done
>>>>>>>>>>>>>>>>>>> after we used some identifier. Which means there is no way 
>>>>>>>>>>>>>>>>>>> of using UUIDs
>>>>>>>>>>>>>>>>>>> without a functioning catalog/identifier setup.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> In conclusion, I prefer the current behavior for
>>>>>>>>>>>>>>>>>>> "default-catalog" because it is more deterministic in my 
>>>>>>>>>>>>>>>>>>> opinion. And I
>>>>>>>>>>>>>>>>>>> think the current spec does a good job for multi-engine 
>>>>>>>>>>>>>>>>>>> table identifier
>>>>>>>>>>>>>>>>>>> resolution. I see the UUID validation more as an additional 
>>>>>>>>>>>>>>>>>>> hardening
>>>>>>>>>>>>>>>>>>> strategy.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Thanks
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Jan
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> On 4/21/25 17:38, Walaa Eldin Moustafa wrote:
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Thanks Renjie!
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> The existing spec has some guidance on resolving
>>>>>>>>>>>>>>>>>>> catalogs on the fly already (to address the case of view 
>>>>>>>>>>>>>>>>>>> text with table
>>>>>>>>>>>>>>>>>>> identifiers missing the catalog part). The guidance is to 
>>>>>>>>>>>>>>>>>>> use the catalog
>>>>>>>>>>>>>>>>>>> where the view is stored. But I find this rule hard to 
>>>>>>>>>>>>>>>>>>> interpret or use.
>>>>>>>>>>>>>>>>>>> The catalog itself is a logical construct—such as a 
>>>>>>>>>>>>>>>>>>> federated catalog that
>>>>>>>>>>>>>>>>>>> delegates to multiple physical backends (e.g., HMS and 
>>>>>>>>>>>>>>>>>>> REST). In such
>>>>>>>>>>>>>>>>>>> cases, the catalog (e.g., `my_catalog` in 
>>>>>>>>>>>>>>>>>>> `my_catalog.namespace1.table1`)
>>>>>>>>>>>>>>>>>>> doesn’t physically store the tables; it only routes 
>>>>>>>>>>>>>>>>>>> requests to underlying
>>>>>>>>>>>>>>>>>>> stores. Therefore, defaulting identifier resolution based 
>>>>>>>>>>>>>>>>>>> on the catalog
>>>>>>>>>>>>>>>>>>> where the view is "stored" doesn’t align with how catalogs 
>>>>>>>>>>>>>>>>>>> actually behave
>>>>>>>>>>>>>>>>>>> in practice.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> >>>>>> Walaa.
>>>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>> On Sun, Apr 20, 2025 at 11:17 PM Renjie Liu <
>>>>>>>>>>>>>>>>>>> liurenjie2...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>> Hi, Walaa:
>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>> Thanks for the proposal.
>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>> I've reviewed the doc, but in general I have
>>>>>>>>>>>>>>>>>>> some concerns with resolving catalog names on the fly with 
>>>>>>>>>>>>>>>>>>> query engine
>>>>>>>>>>>>>>>>>>> defined catalog names. This introduces some flexibility at 
>>>>>>>>>>>>>>>>>>> first glance,
>>>>>>>>>>>>>>>>>>> but also makes misconfiguration difficult to explain.
>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>> But I agree with one part: we should store
>>>>>>>>>>>>>>>>>>> the resolved table UUID in view metadata, as table/view 
>>>>>>>>>>>>>>>>>>> renaming may introduce
>>>>>>>>>>>>>>>>>>> errors that are difficult for users to understand.
>>>>>>>>>>>>>>>>>>> >>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>> On Sat, Apr 19, 2025 at 3:02 AM Walaa Eldin
>>>>>>>>>>>>>>>>>>> Moustafa <wa.moust...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>> Looking forward to keeping up the momentum and
>>>>>>>>>>>>>>>>>>> closing out the MV spec as well. I’m hoping we can proceed 
>>>>>>>>>>>>>>>>>>> to a vote next
>>>>>>>>>>>>>>>>>>> week.
>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>> Here is a summary in case that helps. The
>>>>>>>>>>>>>>>>>>> proposal outlines a strategy for handling table identifiers 
>>>>>>>>>>>>>>>>>>> in Iceberg view
>>>>>>>>>>>>>>>>>>> metadata, with the goal of ensuring correctness, 
>>>>>>>>>>>>>>>>>>> portability, and engine
>>>>>>>>>>>>>>>>>>> compatibility. It recommends resolving table identifiers at 
>>>>>>>>>>>>>>>>>>> read time (late
>>>>>>>>>>>>>>>>>>> binding) rather than creation time, and introduces 
>>>>>>>>>>>>>>>>>>> UUID-based validation to
>>>>>>>>>>>>>>>>>>> maintain identity guarantees across engines or sessions. 
>>>>>>>>>>>>>>>>>>> It also revises
>>>>>>>>>>>>>>>>>>> how default-catalog and default-namespace are handled 
>>>>>>>>>>>>>>>>>>> (defaulting both to
>>>>>>>>>>>>>>>>>>> the session context if not explicitly set) to better align 
>>>>>>>>>>>>>>>>>>> with engine
>>>>>>>>>>>>>>>>>>> behavior and improve cross-engine interoperability.
>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>> Please let me know your thoughts.
>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> >>>>>>>> Walaa.
>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>> On Wed, Apr 16, 2025 at 2:03 PM Walaa Eldin
>>>>>>>>>>>>>>>>>>> Moustafa <wa.moust...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>> Thanks Eduard and Sung! I have addressed the
>>>>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>> One key point to keep in mind is that catalog
>>>>>>>>>>>>>>>>>>> names in the spec refer to logical catalogs—i.e., the first 
>>>>>>>>>>>>>>>>>>> part of a
>>>>>>>>>>>>>>>>>>> three-part identifier. These correspond to Spark's 
>>>>>>>>>>>>>>>>>>> DataSourceV2 catalogs,
>>>>>>>>>>>>>>>>>>> Trino connectors, and similar constructs. This is a level 
>>>>>>>>>>>>>>>>>>> of abstraction
>>>>>>>>>>>>>>>>>>> above physical catalogs, which are not referenced or used 
>>>>>>>>>>>>>>>>>>> in the view spec.
>>>>>>>>>>>>>>>>>>> The reason is that table identifiers in the view 
>>>>>>>>>>>>>>>>>>> definition/text itself
>>>>>>>>>>>>>>>>>>> refer to logical catalogs, not physical ones (since they 
>>>>>>>>>>>>>>>>>>> interface directly
>>>>>>>>>>>>>>>>>>> with the engine and not a specific metastore).
>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> >>>>>>>>> Walaa.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>> On Wed, Apr 16, 2025 at 6:15 AM Sung Yun <
>>>>>>>>>>>>>>>>>>> sungwy...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Thank you Walaa for the proposal. I think
>>>>>>>>>>>>>>>>>>> view portability is a very important topic for us to 
>>>>>>>>>>>>>>>>>>> continue discussing as
>>>>>>>>>>>>>>>>>>> it relies on many assumptions within the data ecosystem for 
>>>>>>>>>>>>>>>>>>> it to function,
>>>>>>>>>>>>>>>>>>> as you've highlighted well in the document.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>> I've added a few comments around how this may
>>>>>>>>>>>>>>>>>>> impact the permission questions the engines will be asking, 
>>>>>>>>>>>>>>>>>>> and whether
>>>>>>>>>>>>>>>>>>> that is the desired behavior.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>> Sung
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>> On Wed, Apr 16, 2025 at 7:32 AM Eduard
>>>>>>>>>>>>>>>>>>> Tudenhöfner <etudenhoef...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Thanks Walaa for tackling this problem. I've
>>>>>>>>>>>>>>>>>>> added a few comments to get a better understanding of what 
>>>>>>>>>>>>>>>>>>> this will look
>>>>>>>>>>>>>>>>>>> like in the actual implementation.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> Eduard
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>> On Tue, Apr 15, 2025 at 7:09 PM Walaa Eldin
>>>>>>>>>>>>>>>>>>> Moustafa <wa.moust...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Starting this thread to resume our
>>>>>>>>>>>>>>>>>>> discussion on how to reference table identifiers from 
>>>>>>>>>>>>>>>>>>> Iceberg metadata, a
>>>>>>>>>>>>>>>>>>> key aspect of the view specification, particularly in 
>>>>>>>>>>>>>>>>>>> relation to the MV
>>>>>>>>>>>>>>>>>>> (materialized view) extensions.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> I had the chance to speak offline with a
>>>>>>>>>>>>>>>>>>> few community members to better understand how the current 
>>>>>>>>>>>>>>>>>>> spec is being
>>>>>>>>>>>>>>>>>>> interpreted. Those conversations served as inputs to a new 
>>>>>>>>>>>>>>>>>>> proposal on how
>>>>>>>>>>>>>>>>>>> table identifier references could be represented in 
>>>>>>>>>>>>>>>>>>> metadata.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> You can find the proposal here [1]. I look
>>>>>>>>>>>>>>>>>>> forward to your feedback and working together to move this 
>>>>>>>>>>>>>>>>>>> forward so we
>>>>>>>>>>>>>>>>>>> can finalize the MV spec as well.
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1-I2v_OqBgJi_8HVaeH1u2jowghmXoB8XaJLzPBa_Hg8/edit?tab=t.0
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>> Walaa.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
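
For readers following the thread, the two catalog fallback rules being debated, plus the UUID check, can be sketched roughly as follows. This is only an illustration under stated assumptions: TableRef, ViewMetadata, resolve, and check_uuid are hypothetical names, not part of the Iceberg spec or any Iceberg API, and identifiers are assumed to be already split into catalog, namespace, and table parts. The session default namespace fallback from the proposal is not modeled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TableRef:
    """A table identifier, possibly partial (hypothetical helper type)."""
    catalog: Optional[str]
    namespace: Tuple[str, ...]
    table: str


@dataclass(frozen=True)
class ViewMetadata:
    """Simplified stand-in for the two view metadata fields discussed."""
    default_catalog: Optional[str]
    default_namespace: Tuple[str, ...]


def resolve(ref, view, view_catalog, session_catalog, session_fallback=False):
    """Fill in the missing parts of a table reference found in the view SQL.

    session_fallback=False: fall back to the catalog the view was loaded
    from (the reading of the current spec described in the thread).
    session_fallback=True: fall back to the session's default catalog
    (the behavior suggested in the proposal).
    """
    fallback = session_catalog if session_fallback else view_catalog
    catalog = ref.catalog or view.default_catalog or fallback
    namespace = ref.namespace or view.default_namespace
    return TableRef(catalog, namespace, ref.table)


def check_uuid(expected_uuid, loaded_uuid):
    """UUID validation: fail if the loaded table is not the one recorded."""
    if expected_uuid != loaded_uuid:
        raise ValueError(
            f"table UUID mismatch: expected {expected_uuid}, got {loaded_uuid}"
        )


# The example from the thread: the view lives in catalogA and its SQL
# references sales.orders, with no default-catalog set.
view = ViewMetadata(default_catalog=None, default_namespace=())
ref = TableRef(catalog=None, namespace=("sales",), table="orders")

current = resolve(ref, view, view_catalog="catalogA", session_catalog="catalogB")
proposed = resolve(ref, view, view_catalog="catalogA", session_catalog="catalogB",
                   session_fallback=True)
print(current.catalog)   # catalogA
print(proposed.catalog)  # catalogB
```

In both variants the UUID check can only run after a table has been loaded through some identifier, which is the point made in the thread about UUIDs complementing identifier resolution rather than replacing it.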
