Hi Benny,

> we don't need to list out such restrictions because they really depend on
> the setup

I do not think this is correct. The restrictions do not depend on the
setup. They rather dictate it. All restrictions discussed in this thread do
that one way or the other.

The single-engine (Dremio) example does not apply to this discussion. The
spec is clear when a single engine is in use, but the spec is not limited to
single-engine use cases.

> If Dremio intended for that view to be readable by Spark, it would have
> to adhere to all those restrictions I listed before.

Sure, but those restrictions are only stated on the mailing list (in many
forms). We are discussing whether we should add them to the spec (in one form).
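
To make the first two restrictions from my earlier message (quoted below)
concrete, here is a minimal sketch of how an engine might expand a partially
specified SQL identifier. This is an illustration only, not spec text: the
`resolve` function, the `FullIdentifier` type, the single-level namespace
assumption, and the catalog/namespace names are all hypothetical. It does not
cover the third restriction (that a fully specified identifier maps to the
same storage table in every engine).

```python
from typing import NamedTuple


class FullIdentifier(NamedTuple):
    """Hypothetical fully specified identifier: catalog.namespace.table."""
    catalog: str
    namespace: str
    table: str


def resolve(identifier: str, default_catalog: str, default_namespace: str) -> FullIdentifier:
    """Expand a dotted identifier using an engine's defaults.

    Assumption for this sketch: namespaces have exactly one level, so
    1 part = table, 2 parts = namespace.table, 3 parts = catalog.namespace.table.
    """
    parts = identifier.split(".")
    if len(parts) == 3:
        # Fully specified: the defaults are not consulted.
        return FullIdentifier(*parts)
    if len(parts) == 2:
        # Catalog omitted: fall back to the engine's default catalog.
        return FullIdentifier(default_catalog, parts[0], parts[1])
    if len(parts) == 1:
        # Catalog and namespace omitted: fall back to both defaults.
        return FullIdentifier(default_catalog, default_namespace, parts[0])
    raise ValueError(f"unsupported identifier: {identifier!r}")


# Two engines that agree on their defaults resolve the view's SQL identically.
engine_a = {"default_catalog": "prod", "default_namespace": "sales"}
engine_b = {"default_catalog": "prod", "default_namespace": "sales"}
assert resolve("orders", **engine_a) == resolve("orders", **engine_b)

# If the default catalog differs, the same view SQL points at different tables.
engine_c = {"default_catalog": "nessie", "default_namespace": "sales"}
assert resolve("orders", **engine_a) != resolve("orders", **engine_c)

# A fully specified identifier is unaffected by the defaults.
assert resolve("prod.sales.orders", **engine_a) == resolve("prod.sales.orders", **engine_c)
```

The point of the sketch is that the view's SQL text alone does not determine
what it reads; the engine's defaults do, unless every identifier is fully
specified.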

Thanks
Walaa.


On Fri, Oct 11, 2024 at 4:56 PM Benny Chow <btc...@gmail.com> wrote:

> Hi Russell
>
> Yes, you listed out the requirements to make the two Spark engines case
> work.  Basically, it allows each engine to dynamically resolve the table
> identifiers under the correct catalog name.
>
> Hello Walaa
>
> IMO, we don't need to list out such restrictions because they really
> depend on the setup.   Multiple Iceberg catalogs?  Multiple engines?
> Consistent catalog names?  Are views created with USE in context?  Today,
> in Dremio, we save tons of views to Nessie with fully qualified SQL
> identifiers to other sources such as mysql or snowflake.  Those views may
> or may not have default-catalog and default-namespaces set depending on the
> USE context.  If Dremio intended for that view to be readable by Spark, it
> would have to adhere to all those restrictions I listed before.
>
> Thanks
> Benny
>
> On Fri, Oct 11, 2024 at 10:00 AM Walaa Eldin Moustafa <
> wa.moust...@gmail.com> wrote:
>
>> Benny, "Iceberg View Spec Improvements" includes documenting what is
>> supported and what is not. You listed a few restrictions. Many of them are
>> not documented in the current spec. Documenting them is what this thread is
>> about. We are trying to reach a consensus on the necessary constraints (so
>> we are not over- or under-restricting).
>>
>> Russell, I think what you stated is a version of the restrictions. From
>> my point of view, the list of necessary restrictions is:
>>
>> * Engines must share the same default catalog names, ensuring that
>> partially specified SQL identifiers with catalog omitted are resolved to
>> the same fully specified SQL identifier across all engines.
>> * Engines must share the same default namespaces, ensuring that SQL
>> identifiers without catalog and namespace are resolved to the same fully
>> specified SQL identifier across all engines.
>> * All engines must resolve a fully specified SQL identifier to the same
>> storage table in the same storage catalog.
>>
>> Please let me know if this aligns with what you stated.
>>
>> Thanks,
>> Walaa.
>>
>>
