Re: Iceberg View Spec Improvements

Walaa Eldin Moustafa Sun, 20 Oct 2024 22:29:23 -0700

Hi Everyone,

Thanks for all the discussion so far. I have created a PR to document the
requirements: https://github.com/apache/iceberg/pull/11365. Please feel free
to review it or to continue the discussion in this thread.


Thanks,
Walaa.


On Fri, Oct 11, 2024 at 5:19 PM Walaa Eldin Moustafa <wa.moust...@gmail.com>
wrote:

> Hi Benny,
>
> > we don't need to list out such restrictions because they really depend
> on the setup
>
> I do not think this is correct. The restrictions do not depend on the
> setup. They rather dictate it. All restrictions discussed in this thread do
> that one way or the other.
>
> The single-engine (Dremio) example does not apply to this discussion. The
> spec is clear when a single engine is in use, but it is not limited to
> single-engine use cases.
>
> > If Dremio intended for that view to be readable by Spark, it would have
> to adhere to all those restrictions I listed before.
>
> Sure, but those restrictions are only stated in the mailing list (in many
> forms). We are discussing if we should add them to the spec (in one form).
>
> Thanks
> Walaa.
>
>
> On Fri, Oct 11, 2024 at 4:56 PM Benny Chow <btc...@gmail.com> wrote:
>
>> Hi Russell
>>
>> Yes, you listed out the requirements to make the two Spark engines case
>> work.  Basically, it allows each engine to dynamically resolve the table
>> identifiers under the correct catalog name.
>>
>> Hello Walaa
>>
>> IMO, we don't need to list out such restrictions because they really
>> depend on the setup.   Multiple Iceberg catalogs?  Multiple engines?
>> Consistent catalog names?  Are views created with USE in context?  Today,
>> in Dremio, we save tons of views to Nessie with fully qualified SQL
>> identifiers to other sources such as mysql or snowflake.  Those views may
>> or may not have default-catalog and default-namespaces set depending on the
>> USE context.  If Dremio intended for that view to be readable by Spark, it
>> would have to adhere to all those restrictions I listed before.
>>
>> Thanks
>> Benny
>>
>> On Fri, Oct 11, 2024 at 10:00 AM Walaa Eldin Moustafa <
>> wa.moust...@gmail.com> wrote:
>>
>>> Benny, "Iceberg View Spec Improvements" includes documenting what is
>>> supported and what is not. You listed a few restrictions. Many of them are
>>> not documented in the current spec. Documenting them is what this thread is
>>> about. We are trying to reach a consensus on the necessary constraints (so
>>> we are not over- or under-restricting).
>>>
>>> Russell, I think what you stated is a version of the restrictions. From
>>> my point of view, the list of necessary restrictions is:
>>>
>>> * Engines must share the same default catalog names, ensuring that
>>> partially specified SQL identifiers with catalog omitted are resolved to
>>> the same fully specified SQL identifier across all engines.
>>> * Engines must share the same default namespaces, ensuring that SQL
>>> identifiers without catalog and namespace are resolved to the same fully
>>> specified SQL identifier across all engines.
>>> * All engines must resolve a fully specified SQL identifier to the same
>>> storage table in the same storage catalog.
>>>
>>> Please let me know if this aligns with what you stated.
>>>
>>> Thanks,
>>> Walaa.
>>>
>>>
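
As an illustration of the three restrictions listed in the quoted message
above, here is a minimal Python sketch. It is not Iceberg code: the `Engine`
class, its fields, the `agree` helper, and the catalog names and URIs are all
hypothetical, and the rule "a leading part that matches a known catalog name
is the catalog" is a simplifying assumption, not something the spec or the
thread states. It only shows why differing defaults make a partially
specified identifier resolve to different tables.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Engine:
    """Hypothetical engine configuration (illustration only, not an Iceberg API)."""
    name: str
    default_catalog: str
    default_namespace: Tuple[str, ...]
    # Catalog name as seen by this engine -> storage catalog it points at.
    catalogs: Dict[str, str]

    def qualify(self, identifier: str) -> Tuple[str, Tuple[str, ...], str]:
        """Expand a possibly partial SQL identifier to (catalog, namespace, table)."""
        parts = identifier.split(".")
        if len(parts) == 1:
            # Catalog and namespace omitted: use both defaults.
            return (self.default_catalog, self.default_namespace, parts[0])
        if parts[0] in self.catalogs:
            # Assumption: a leading part naming a known catalog is the catalog.
            return (parts[0], tuple(parts[1:-1]), parts[-1])
        # Catalog omitted: use the default catalog.
        return (self.default_catalog, tuple(parts[:-1]), parts[-1])

    def storage_table(self, identifier: str) -> Tuple[str, Tuple[str, ...], str]:
        """Resolve an identifier to (storage catalog, namespace, table)."""
        catalog, namespace, table = self.qualify(identifier)
        return (self.catalogs[catalog], namespace, table)


def agree(engines: List[Engine], identifier: str) -> bool:
    """True if every engine resolves the identifier to the same storage table."""
    return len({e.storage_table(identifier) for e in engines}) == 1


# Two engines with matching defaults, and one with a different default catalog.
spark_a = Engine("spark_a", "prod", ("db",), {"prod": "rest://catalog-1"})
spark_b = Engine("spark_b", "prod", ("db",), {"prod": "rest://catalog-1"})
spark_c = Engine(
    "spark_c", "dev", ("db",),
    {"prod": "rest://catalog-1", "dev": "rest://catalog-2"},
)

same_defaults = agree([spark_a, spark_b], "events")           # defaults match
different_defaults = agree([spark_a, spark_c], "events")      # default catalogs differ
fully_qualified = agree([spark_a, spark_c], "prod.db.events")  # no defaults involved
```

In this sketch the bare identifier `events` resolves consistently only when
the default catalog and namespace match, while the fully specified identifier
resolves consistently as long as both engines map `prod` to the same storage
catalog, which is the third restriction.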
