Having spent some time testing Nessie views with multiple engines (Dremio +
Spark) using different catalog names and different namespaces, I tend to
agree with Dan and Amogh that the current view spec is fine.  Unlike
tables, I think when it comes to views, engines have to "work together" if
they expect to share the views.  Working together means:

- Providing multiple SQL representations
- Not using engine-specific operators or UDFs
- Not using engine-specific row and column access policies
- Not using engine-specific role-based access control features such as view
  delegation (e.g. query user vs. view owner)
- Not using fully qualified SQL identifiers when engines don't standardize on
  catalog names
- Using standardized catalog names if cross-catalog joins are needed in the
  view SQL

Some of the above limitations also exist even for the same engine when you
have for example two Spark clusters pointing to the same catalog and each
cluster uses different catalog names.
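
To make the points above about fully qualified identifiers and catalog names
concrete, here is a minimal Python sketch. It is not an Iceberg or Nessie API:
the function names, the catalog names ("nessie", "lake"), and the table
sales.orders are all hypothetical, and it assumes single-level namespaces so
that a two-part name is always <db>.<table>. It only illustrates the rule
discussed in this thread, that a reference without a catalog resolves against
the catalog in which the view is defined.

```python
from typing import Optional, Sequence, Set, Tuple


def resolve_reference(
    parts: Sequence[str],
    view_catalog: str,
    default_namespace: str,
    default_catalog: Optional[str] = None,
) -> Tuple[str, str, str]:
    """Expand a table reference from a view's SQL to (catalog, namespace, table).

    Assumption: single-level namespaces, so two parts always mean <db>.<table>.
    """
    # With no default catalog recorded, fall back to the catalog the view was
    # loaded from, under whatever name the local engine gives that catalog.
    catalog = default_catalog if default_catalog is not None else view_catalog
    if len(parts) == 1:
        return (catalog, default_namespace, parts[0])
    if len(parts) == 2:
        return (catalog, parts[0], parts[1])
    if len(parts) == 3:
        return (parts[0], parts[1], parts[2])
    raise ValueError(f"unsupported identifier: {'.'.join(parts)}")


def is_resolvable(resolved: Tuple[str, str, str], engine_catalogs: Set[str]) -> bool:
    """True if the engine has a catalog registered under the resolved name."""
    return resolved[0] in engine_catalogs


# Hypothetical setup: both engines point at the same physical catalog,
# but each registers it under a different local name.
spark_catalogs = {"nessie"}
dremio_catalogs = {"lake"}

unqualified = ("sales", "orders")           # sales.orders
qualified = ("nessie", "sales", "orders")   # nessie.sales.orders

spark_unq = resolve_reference(unqualified, view_catalog="nessie", default_namespace="sales")
dremio_unq = resolve_reference(unqualified, view_catalog="lake", default_namespace="sales")
dremio_q = resolve_reference(qualified, view_catalog="lake", default_namespace="sales")

print(is_resolvable(spark_unq, spark_catalogs))    # True
print(is_resolvable(dremio_unq, dremio_catalogs))  # True
print(is_resolvable(dremio_q, dremio_catalogs))    # False: "nessie" is unknown here
```

In this sketch the <db>.<table> form works in both engines because each one
substitutes its own local name for the view's catalog, while the fully
qualified form only works where the catalog happens to be called "nessie".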

Best
Benny


On Thu, Oct 10, 2024 at 8:51 PM Amogh Jahagirdar <2am...@gmail.com> wrote:

> I took another pass over the view spec and I believe that the representation
> of identifiers and the way engines should resolve references are both clear.
> So from my perspective, at the moment we do not need to change the view spec
> itself.
>
> I do acknowledge though that practically there can be scenarios where
> catalog names are inconsistent across environments and this has led to
> confusion when developing the MV spec (I'm remembering based on last week's
> community sync). There are some recommendations so that implementations can
> address these inconsistencies in this thread already, but I don't think
> adding some more complexity to the view spec via some form of
> normalizing/mapping identifiers is worth it for these cases. I think in its
> current state it's a sufficient model for developing MVs, and shouldn't
> block progression on that.
>
> I'm +1 on adding an "unsupported configurations" clarification though.
> It's become clear to me that there's enough confusion around the
> implications of the SQL identifiers in the spec that it's worth calling it
> out.
>
> Thanks,
>
> Amogh Jahagirdar
>
> On Thu, Oct 10, 2024 at 5:08 PM Daniel Weeks <dwe...@apache.org> wrote:
>
>> Russell,
>>
>> I think there are a few existing ways to support that.  For example, if
>> you exclude the default catalog and fully reference the table with
>> <catalog>.<db>.<table> most sql engines will interpret that correctly (for
>> cross or known catalogs).  Also, if you omit the catalog and use just
>> <db>.<table>, it must use the catalog in which the view is defined (per the
>> spec), which I think addresses your case.
>>
>> Server-side rewrite is possible, but I think we'd need to explore the
>> specific cases, which we'll probably need to do as we consider secure views
>> more closely.
>>
>> -Dan
>>
>> On Thu, Oct 10, 2024 at 3:59 PM Walaa Eldin Moustafa <
>> wa.moust...@gmail.com> wrote:
>>
>>> Hi Russell,
>>>
>>> Would this be a good candidate for a future version of the spec?
>>>
>>> Thanks,
>>> Walaa.
>>>
>>>
>>> On Thu, Oct 10, 2024 at 3:50 PM Russell Spitzer <
>>> russell.spit...@gmail.com> wrote:
>>>
>>>> I still have an issue with representations not having explicit ways of
>>>> incorporating the catalog name. I'm thinking about our potential future
>>>> situation where we want to return a view for Fine Grained Access policies.
>>>> In that case won't the Catalog need to craft a representation that matches
>>>> the configuration of the engine? Doesn't this mean the client will have to
>>>> tell the Catalog what its local name is?
>>>>
>>>> On Thu, Oct 10, 2024 at 5:34 PM Daniel Weeks <dwe...@apache.org> wrote:
>>>>
>>>>> Hey Walaa,
>>>>>
>>>>> I recognize the issue you're calling out but disagree there is an
>>>>> implicit assumption in the spec.  The spec clearly says how identifiers
>>>>> including catalogs and namespaces are represented/stored and how references
>>>>> need to be resolved.  The idea that a catalog may not match is an
>>>>> environmental/infrastructure/configuration issue related to where they are
>>>>> being referenced from.
>>>>>
>>>>> If we think this is sufficiently confusing to people, I would be open
>>>>> to discussing an "unsupported configurations" callout, but I don't think
>>>>> this blocks work and am somewhat skeptical that it's necessary.
>>>>>
>>>>> -Dan
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Oct 10, 2024 at 2:47 PM Walaa Eldin Moustafa <
>>>>> wa.moust...@gmail.com> wrote:
>>>>>
>>>>>> Hi Dan,
>>>>>>
>>>>>> I think there are a few questions that we should solve to decide the
>>>>>> path forward:
>>>>>>
>>>>>> * Does the current spec contain implicit assumptions?
>>>>>> I think the answer is yes. I think this is also what Ryan indicated
>>>>>> here [1].
>>>>>>
>>>>>> * Do these implicit assumptions make it difficult to adopt the spec
>>>>>> or evolve it in the correct way?
>>>>>> I think the answer is yes as well. MV design discussions became quite
>>>>>> complicated because most contributors had a different understanding of the
>>>>>> spec compared to what it encodes as implicit assumptions (see this thread
>>>>>> for an example [2] -- there are a few more). This unaligned understanding
>>>>>> could possibly lead to inaccurate designs and potentially result in
>>>>>> unneeded further constraints or unneeded engineering complexity.
>>>>>>
>>>>>> * What are the implicit assumptions (stated in an unambiguous way)?
>>>>>> I do not think the answer is clear to everyone, even at this point.
>>>>>> There have been a few variations of those assumptions in this thread alone.
>>>>>> I think we should converge on a clear set of assumptions for everyone's
>>>>>> consumption.
>>>>>>
>>>>>> * Should we add the assumptions explicitly to the spec?
>>>>>> I think we definitely should. Adoption or extension of the spec will
>>>>>> be quite difficult if the assumptions are not clearly stated and are
>>>>>> interpreted differently by different contributors.
>>>>>>
>>>>>> Would be great to hear the community's feedback on whether they agree
>>>>>> with the answers to the above questions.
>>>>>>
>>>>>> [1] https://lists.apache.org/thread/s1hjnc163ny76smv2l0t2sxxn93s4595
>>>>>> [2] https://lists.apache.org/thread/0wzowd15328rnwvotzcoo4jrdyrzlx91
>>>>>>
>>>>>> Thanks,
>>>>>> Walaa.
>>>>>>
>>>>>
