Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-08 Thread Steven Wu
If a table referenced by the view SQL query is deleted and recreated, is the regular view still valid? If yes, it won't be correct to use resolved UUID to validate the name reference. On the materialized view side, we are aligned that UUID is necessary to detect this scenario and mark the refresh s

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-08 Thread Walaa Eldin Moustafa
Hi Dan, Thanks for the clarification! I agree that UUIDs should not be stored in the SQL definition of the view. Just to clarify: the proposal isn't about modifying the view definition itself, but rather about how table identifiers are represented in Iceberg view metadata, which naturally depends

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-08 Thread Daniel Weeks
I don't think we want to include the resolved table UUIDs in the view definition, but rather in the storage table state. You can still resolve whether those drift at some point, but I don't feel like it's a good idea to capture data in the view that we may allow to drift if there isn't any require

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-07 Thread Walaa Eldin Moustafa
Thanks Steven! So would you agree that resolution using default-catalog and default-namespace does not provide full determinism, and requires a supporting safety mechanism? Thanks, Walaa. On Wed, May 7, 2025 at 10:30 PM Steven Wu wrote: > > If the current model is considered deterministic, do y

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-07 Thread Steven Wu
> If the current model is considered deterministic, do you think `default-catalog` and `default-namespace` fields provide enough determinism to eliminate the need for UUIDs when storing table identifiers? I am fine with storing UUIDs for table identifiers in the view. Basically, view creation reso

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-07 Thread Walaa Eldin Moustafa
Hi Steven, Thanks for the reply. > I agree with Dan that we shouldn't solve catalog naming in the Iceberg view spec. To clarify, I don't believe the proposal is trying to solve catalog naming. What it’s doing is simply this: * Proposing that table names inside views resolve the same way as they

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-07 Thread Steven Wu
I agree with Dan that we shouldn't solve catalog naming in the Iceberg view spec. I am not convinced that the proposed change will make the table identifier resolution clearer and more portable. The recommendation of using engines' current catalog and database can cause context-dependent resolution r

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-05-06 Thread Benny Chow
In Spark, I believe that the USE command sets the current catalog and namespace. This affects both where the view is created and how unqualified table identifiers are resolved. I also don't see an issue with saving the current catalog and namespace into the view metadata's default-catalog and de

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-30 Thread Walaa Eldin Moustafa
> I think that's the lesser evil compared to Iceberg specifying how engines should resolve identifiers I think this is also similar to the previous point. It is the other way around. Right now the spec dictates how to resolve (through employing a view-specific `default-catalog` field). The proposa

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-30 Thread Walaa Eldin Moustafa
> I thought "default-catalog" could be set via the USE command. Benny, I think this is a misconception or miscommunication. The USE command has no impact on the `default-catalog` field. In fact, the proposal's direction is exactly to establish that USE command should influence how tables are resol

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-30 Thread Benny Chow
> there is no SQL construct today to explicitly set default-catalog I thought "default-catalog" could be set via the USE command. I generally agree with Dan about requiring consistent catalog names. I think that's the lesser evil compared to Iceberg specifying how engines should resolve identifi

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-29 Thread Walaa Eldin Moustafa
Hi Rishabh, You're right that the proposal touches on two aspects, and resolution rules are one of them. The other aspect is the proposal's position that table identifiers should be stored in metadata exactly as they appear in the view text (e.g., even if they're two-part or partially qualified),

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-29 Thread Rishabh Bhatia
Hello Walaa, Thanks for starting this discussion. I think we should decouple at least the MV Spec from the proposal to change the current behavior of view resolution. We can continue having the discussion if the current view spec needs to be changed or not. Based on the decision at a later point

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Walaa Eldin Moustafa
Correction of typo: both engines seem to set default-catalog to the view catalog if it is defined, or to null if the view catalog is not defined. On Mon, Apr 28, 2025 at 3:06 PM Walaa Eldin Moustafa wrote: > Hi Dan, > > Thanks again for your response. > > I agree that catalog renaming is an envi

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Walaa Eldin Moustafa
Hi Dan, Thanks again for your response. I agree that catalog renaming is an environmental event, but it's a real one that happens frequently in practice. Saying that the Iceberg spec cannot accommodate something as common as catalog renaming feels very restrictive, and could make the spec less pr

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Daniel Weeks
Walaa, > tables inside views remain reachable after a catalog rename This problem stems from the exact environmental/configuration issue that we should not be trying to address. I don't think we would expect references to survive a catalog rename. That's not something covered by the spec and ne

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Walaa Eldin Moustafa
Hi Dan, Thanks for chiming in. I believe the issues we’re seeing now go beyond just catalog naming consistency. The behavior around default-catalog itself introduces resolution inconsistencies even when catalog names are consistent. For example: * When default-catalog is set to null, tables insi

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Daniel Weeks
I would agree with Jan's summary of why 'default-catalog' was introduced, but I think we need to step back and align on what we are really attempting to support in the spec. The issues we're discussing largely stem from using multiple engines with cross catalog references and configurations where

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-28 Thread Jan Kaul
I think the intention with the "default-catalog" was that every query engine uses it to store its session default catalog at the time of creating the view. This way the view could be reused in another session. The idea was not to introduce an additional SQL syntax to set the default-catalog.
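
A minimal sketch of the behavior described above, assuming a single-level namespace and hypothetical names (`Session`, `ViewVersion`, `create_view`, `resolve`) that are not part of any Iceberg or engine API: the engine copies its session defaults into the view metadata when the view is created, and later resolution reads only those stored values, not the reading session.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Session:
    """Hypothetical engine session state (e.g. what a USE command would set)."""
    current_catalog: str
    current_namespace: Tuple[str, ...]


@dataclass(frozen=True)
class ViewVersion:
    """Hypothetical, simplified stand-in for the relevant view metadata fields."""
    sql: str
    default_catalog: str
    default_namespace: Tuple[str, ...]


def create_view(session: Session, sql: str) -> ViewVersion:
    # Capture the creating session's defaults in the view metadata.
    return ViewVersion(
        sql=sql,
        default_catalog=session.current_catalog,
        default_namespace=session.current_namespace,
    )


def resolve(view: ViewVersion, identifier: str) -> Tuple[str, Tuple[str, ...], str]:
    # Assumption: namespaces have exactly one level, so the number of
    # dot-separated parts tells us which parts are missing.
    parts = identifier.split(".")
    if len(parts) == 1:
        return (view.default_catalog, view.default_namespace, parts[0])
    if len(parts) == 2:
        return (view.default_catalog, (parts[0],), parts[1])
    if len(parts) == 3:
        return (parts[0], (parts[1],), parts[2])
    raise ValueError(f"unsupported identifier: {identifier}")


creator = Session(current_catalog="prod", current_namespace=("sales",))
view = create_view(creator, "SELECT * FROM orders")
print(resolve(view, "orders"))
```

Because `resolve` never takes a `Session`, a reader whose current catalog differs from the creator's still gets the same fully qualified name, which is the reuse property the message describes. What the fallback should be when the field is absent is the point debated elsewhere in this thread and is deliberately left out of the sketch.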

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-25 Thread Walaa Eldin Moustafa
To help folks catch up on the latest discussions and interpretation of the spec, I have summarized everything we discussed so far at the top of the proposal document. I have slightly updated the pr

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-25 Thread Benny Chow
I'd like to contribute my opinions on this: - I don't particularly like the current behavior of "default to the view's catalog when default-catalog is not set". Fundamentally, I believe the intent of default-catalog and default-namespace is there to help users write more concise SQL. - spark sess

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-25 Thread Walaa Eldin Moustafa
Thanks for the contribution Benny! +1 to the confusion the fallback creates. Also just to be clear, at this point and after clarifying the current spec intentions, I am convinced that we should remove the default catalog and default namespace fields altogether. Thanks, Walaa. On Fri, Apr 25, 2025

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-25 Thread Walaa Eldin Moustafa
Thanks Steven! How do you recommend making Spark implementation conform to the spec? Do we need Spark SQL extensions and/or Spark catalog APIs for that? How do you recommend reconciling the inconsistencies I shared regarding many resolution methods not consistently being followed in different scen

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-25 Thread Steven Wu
The core issue is the fallback behavior when `default-catalog` is not defined. Current view spec says the fallback should be the catalog where the view is defined. It doesn't really matter what the catalog is named (catalogX) by the read engine. - If a view refers to the tables in the same cata

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-25 Thread Walaa Eldin Moustafa
Hi Jan, Thanks again for continuing the discussion. I want to highlight a few fundamental issues around the interpretation of default-catalog: Here is the real catch: * default-catalog cannot logically be defined at view creation time. It would be circular: the view needs to exist before its met

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-25 Thread Jan Kaul
@Walaa: I would argue that when you run a CREATE VIEW statement the query engine knows which catalog the view is being created in. So even though we typically use late binding to resolve the view catalog at query time, it can also be used at creation time. The query engine would need to kee

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-24 Thread Manu Zhang
> > For example, if we want to validate that the tables referenced in the view > exist, how can we do that when default-catalog isn't defined, since the > view hasn't been created or loaded yet? I don't think this is related to view spec. How do we validate that a table exists without a default ca

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-24 Thread Walaa Eldin Moustafa
Hi Jan, I think we still share the same understanding. Just to clarify: when I referred to late binding as “similar” to the proposal, I was acknowledging the distinction between view-level and table-level resolution. But as you noted, both follow a late binding model. That said, this still raises

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-24 Thread Jan Kaul
Yes, I have the same understanding. The view catalog is resolved at query time. As you mentioned before, it's good to distinguish between the physical catalog and its reference used in SQL statements. The important part is that the physical catalog of the view and the tables referenced in it'

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-23 Thread Walaa Eldin Moustafa
Hi Jan, > The view is executed when it's being referenced in a SQL statement. That statement contains the information for the query engine to resolve the catalog of the view. If I’m understanding correctly, that means: * If the view is queried as SELECT * FROM catalogA.namespace.view, then catal

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-22 Thread Jan Kaul
Hi Walaa, Thanks for clarifying the aspects of non-determinism. Let me try to address your questions. 1. This is my interpretation of the current spec: The view is executed when it's being referenced in a SQL statement. That statement contains the information for the query engine to resolve

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-22 Thread Walaa Eldin Moustafa
Hi Jan, Thanks for the thoughtful feedback. I think it’s important we clarify a key point before going deeper: Non-determinism is not caused by session fallback behavior—it’s a *fundamental limitation of using table identifiers* alone, regardless of whether we use the current rule, the proposed

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-22 Thread Jan Kaul
Hi Walaa, thank you for your proposal. If I understood correctly, your proposal is composed of three parts: - session default catalog as fallback for "default-catalog" - session default namespace as fallback for "default-namespace" - Late binding + UUID validation I have some comments regardi

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-21 Thread Walaa Eldin Moustafa
Thanks Renjie! The existing spec has some guidance on resolving catalogs on the fly already (to address the case of view text with table identifiers missing the catalog part). The guidance is to use the catalog where the view is stored. But I find this rule hard to interpret or use. The catalog it

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-20 Thread Renjie Liu
Hi, Walaa: Thanks for the proposal. I've reviewed the doc, but in general I have some concerns with resolving catalog names on the fly with query engine defined catalog names. This introduces some flexibility at first glance, but also makes misconfiguration difficult to explain. But I agree with

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-18 Thread Walaa Eldin Moustafa
Hi Everyone, Looking forward to keeping up the momentum and closing out the MV spec as well. I’m hoping we can proceed to a vote next week. Here is a summary in case that helps. The proposal outlines a strategy for handling table identifiers in Iceberg view metadata, with the goal of ensuring cor

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-16 Thread Walaa Eldin Moustafa
Thanks Eduard and Sung! I have addressed the comments. One key point to keep in mind is that catalog names in the spec refer to logical catalogs—i.e., the first part of a three-part identifier. These correspond to Spark's DataSourceV2 catalogs, Trino connectors, and similar constructs. This is a l

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-16 Thread Sung Yun
Thank you Walaa for the proposal. I think view portability is a very important topic for us to continue discussing as it relies on many assumptions within the data ecosystem for it to function like you've highlighted well in the document. I've added a few comments around how this may impact the pe

Re: [DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-16 Thread Eduard Tudenhöfner
Thanks Walaa for tackling this problem. I've added a few comments to get a better understanding of what this will look like in the actual implementation. Eduard On Tue, Apr 15, 2025 at 7:09 PM Walaa Eldin Moustafa wrote: > Hi Everyone, > > Starting this thread to resume our discussion on how to

[DISCUSS] Table Identifiers in Iceberg View Spec

2025-04-15 Thread Walaa Eldin Moustafa
Hi Everyone, Starting this thread to resume our discussion on how to reference table identifiers from Iceberg metadata, a key aspect of the view specification, particularly in relation to the MV (materialized view) extensions. I had the chance to speak offline with a few community members to bett