Hi Russell

Yes, you listed the requirements that make the two-Spark-engines case
work.  Essentially, they let each engine dynamically resolve table
identifiers under the correct catalog name.
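
As a rough sketch of what that implies (the catalog name "prod", the REST
URI, and the table names below are all hypothetical), both engines would
register the same Iceberg catalog under the same name, so a fully
qualified view body resolves identically in each:

    -- spark-defaults.conf on BOTH engines (illustrative values):
    --   spark.sql.catalog.prod       org.apache.iceberg.spark.SparkCatalog
    --   spark.sql.catalog.prod.type  rest
    --   spark.sql.catalog.prod.uri   https://catalog.example.com

    -- Engine A creates a view whose body is fully qualified:
    CREATE VIEW prod.db.recent_events AS
    SELECT * FROM prod.db.events
    WHERE event_ts >= date_sub(current_date(), 7);

    -- Engine B maps "prod" to the same underlying catalog, so the view
    -- body resolves to the same storage table:
    SELECT * FROM prod.db.recent_events;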

Hello Walaa

IMO, we don't need to list out such restrictions because they really depend
on the setup.   Multiple Iceberg catalogs?  Multiple engines?  Consistent
catalog names?  Are views created with a USE context?  Today, in Dremio, we
save tons of views to Nessie with fully qualified SQL identifiers pointing
at other sources such as MySQL or Snowflake.  Those views may or may not
have default-catalog and default-namespace set, depending on the USE
context.  If Dremio intended such a view to be readable by Spark, it would
have to adhere to all the restrictions I listed before.
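
To make the USE-context point concrete, here is a hedged sketch (catalog,
namespace, and table names are hypothetical) of how the creation context
changes what a view records:

    -- Created under a USE context: the body can stay partially
    -- qualified, and the session's catalog/namespace are recorded as
    -- default-catalog/default-namespace.
    USE prod.marketing;
    CREATE VIEW recent_events AS SELECT * FROM events;

    -- Created without a USE context: the body itself must be fully
    -- qualified for another engine to resolve it the same way.
    CREATE VIEW prod.marketing.recent_events AS
    SELECT * FROM prod.marketing.events;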

Thanks
Benny

On Fri, Oct 11, 2024 at 10:00 AM Walaa Eldin Moustafa <wa.moust...@gmail.com>
wrote:

> Benny, "Iceberg View Spec Improvements" includes documenting what is
> supported and what is not. You listed a few restrictions. Many of them are
> not documented on the current spec. Documenting them is what this thread is
> about. We are trying to reach a consensus on the necessary constraints (so
> we are not over- or under-restricting).
>
> Russell, I think what you stated is a version of the restrictions. From my
> point of view, the list of necessary restrictions is:
>
> * Engines must share the same default catalog names, ensuring that
> partially specified SQL identifiers with catalog omitted are resolved to
> the same fully specified SQL identifier across all engines.
> * Engines must share the same default namespaces, ensuring that SQL
> identifiers without catalog and namespace are resolved to the same fully
> specified SQL identifier across all engines.
> * All engines must resolve a fully specified SQL identifier to the same
> storage table in the same storage catalog.
>
> Please let me know if this aligns with what you stated.
>
> Thanks,
> Walaa.
>
>
