Ah, I misunderstood that this applies only to views generated based on an
FGAC policy. On this point I agree -- we probably don't want to change VIEW
definitions, or at least we need to be very cautious about doing so. Let's
not put this on the critical path for FGAC views.

--EM

On Tue, May 20, 2025 at 5:07 PM Robert Stupp <sn...@snazy.de> wrote:

> This proposal _does_ change the view definition - it returns a
> _different_ representation than the one that has been stored before.
> This is a change that breaks the contract of the specification and it
> changes the observed behavior.
>
> FGAC is a very complex topic. The right way would be to have a holistic
> design and agreed-on approach. That does take time.
>
> On 20.05.25 16:47, Eric Maynard wrote:
> > I wouldn’t say that Polaris is changing a view definition, but per my
> > understanding Polaris is actually generating a view based on a Policy.
> >
> > We will need a way for Polaris to embed some information into these
> views.
> > I don’t think this is a P0 to make FGAC work, and I don’t think this
> > necessarily needs to take the form of a SQL function. For example, it
> could
> > be through a policy like:
> >
> > {
> >    “allow_columns”: [“a”, “b”],
> >    “transform_columns”: {
> >      {
> >        “col”: “c”,
> >        “predicate”: “some_func(x)”},
> >      {
> >        “col”: “d”,
> >        “predicate: “${current_user} == admin”
> >    }
> > }
> >
> > Perhaps we can try to get the first iteration of FGAC (with only field
> > “allow_columns”) out first. Then, we can implement the engine-driven
> > predicates (like that on column “c”). Finally, we can examine options for
> > catalog-driven predicates like the one on column “d”.
> >
> > —EM
> >
> > On Tue, May 20, 2025 at 9:44 AM Robert Stupp <sn...@snazy.de> wrote:
> >
> >> I don't think that Polaris should change any view definition in any way,
> >> but this is what the proposed approach does.
> >>
> >> The approach breaks the contract (behavior defined by the specification)
> >> and in turn the observed behavior, absolute no go's.
> >>
> >> Practically speaking, the approach is blindly changing some string
> >> without any knowledge about the actual meaning. But it has to know
> >> exactly what it's doing - and to do that it has to know all the SQL
> >> dialects.
> >>
> >> As a side note: every query engine already has information about the
> >> user. I'm definitely not supporting exposing any authZ related
> information.
> >>
> >> FGAC as a feature is a great thing to have. But the proposed approach is
> >> not the right way.
> >>
> >>
> >> On 19.05.25 20:33, Prashant Singh wrote:
> >>> Hey Robert,
> >>>
> >>> Thank you for your honest feedbacks, please let me try answering your
> >>> concerns :
> >>>
> >>>> There are tons of SQL dialects out there, each requires its own fully
> >>> implemented lexer/parser/interpreter
> >>> That's true and we are not interpreting it either, we are just
> replacing
> >>> the sql text wherever there is `is_principal('<principal_name>')` with
> >> the
> >>> value of TRUE and FALSE from the server end
> >>> we are not re-interpreting or parsing the tree, i am assuming this is
> >> what
> >>> Analyzer already does in constant folding and boolean simplification,
> but
> >>> yes post parsing, but IMHO i don't think it's an impossible thing to
> >>> achieve. If it helps i am even fine is wrapping this as
> >>> `{{is_principal('<principal_name>')}}` to make this very specific and
> let
> >>> only Polaris work, IMHO we can work out the rough edges with the
> >>> replacement.
> >>>
> >>>> for view containing view
> >>> This should not be a problem, as we just replace the text of the
> current
> >>> view definition when it comes to resolve the nested view it will issue
> >> the
> >>> same call of LOAD view but with the nested view identifier, when it
> will
> >> be
> >>> the call of nested view and that's when i we will do the replace
> >>> we don't open the nested view in the definition during the loadView of
> >> the
> >>> parent, if that's the concern here, the nested view is treated
> equivalent
> >>> to any other identifier which is opened / interpreted at later state of
> >>> execution.
> >>>
> >>>> Exposing authZ information via any kind of publicly accessible API to
> >>> every user sounds like an interesting source of information -
> especially
> >>> for the "not so good and nice guys".
> >>>
> >>> Yes that's true and that's my intention it's just how we are delivering
> >> the
> >>> info, i.e i expose it by view definition itself (or by any other entity
> >>> stored in Polaris) , but exposing this as an API would require engine
> >> side
> >>> integration too, which we as catalog have a very less control over as a
> >>> catalog.
> >>>
> >>>> What's the benefit over having the ACLs on the table/view defined in
> the
> >>> intended way?
> >>>
> >>> It's more from feature parity perspective and giving more control on
> view
> >>> rather than just ACL (which are conjunctions) for ex if we just
> >> complicate
> >>> the view def with more predicated for ex disjunction
> >>>
> >>> select * from ns1.layer1_table where (condition1) OR
> >>> (is_principal_role('ANALYST'))
> >>>
> >>> I would love to get your further feedback, considering the above.
> >>>
> >>>
> >>> Best,
> >>> Prashant Singh
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, May 19, 2025 at 11:04 AM Robert Stupp <sn...@snazy.de> wrote:
> >>>
> >>>> I'm brutally honest here:
> >>>>
> >>>> I think we should really stay away from interpreting SQL or any other
> >>>> kind of (view) definition in Polaris. There are tons of SQL dialects
> out
> >>>> there, each requires its own fully implemented
> lexer/parser/interpreter
> >>>> - plus views-in-views-in-views-in-views... constructs requiring
> >>>> resolution of nested views. It eventually ends in implementing
> >>>> yet-another-query-engine. I doubt that this is doable with a
> >>>> "java.lang.String.replace(from, to)" approach.
> >>>>
> >>>> Exposing authZ information via any kind of publicly accessible API to
> >>>> every user sounds like an interesting source of information -
> especially
> >>>> for the "not so good and nice guys".
> >>>>
> >>>> Regarding the examples: what's the benefit over having the the ACLs on
> >>>> the table/view defined in the intended way?
> >>>>
> >>>> On 19.05.25 19:26, Prashant Singh wrote:
> >>>>> Hi everyone,
> >>>>>
> >>>>> I’d like to propose adding *context-aware functions* to Apache
> Polaris
> >> so
> >>>>> that view definitions can resolve security context on the Polaris
> side
> >>>> (aka
> >>>>> catalog end without depending on engines).
> >>>>>
> >>>>> *Proposed functions*
> >>>>>
> >>>>>       1.
> >>>>>
> >>>>>       *is_principal('<principal_name>')* – returns TRUE if the
> >>>> authenticated
> >>>>>       principal matches <principal_name>, otherwise FALSE.
> >>>>>       2.
> >>>>>
> >>>>>       *is_principal_role('<principal_role_name>')* – returns TRUE
> when
> >>>>>       <principal_role_name> appears in the principal’s role set.
> >>>>>       3.
> >>>>>
> >>>>>       *is_catalog_role('<catalog_role_name>')* – analogous check at
> the
> >>>>>       catalog-role level.
> >>>>>
> >>>>> *Why it matters*
> >>>>>
> >>>>> These predicates make views dynamic. Example:
> >>>>>
> >>>>> CREATE VIEW dynamic_vw ASSELECT *FROM ns1.layer1_tableWHERE
> >>>>> is_principal_role('ANALYST');
> >>>>>
> >>>>> When a user whose one of principal roles include *ANALYST* calls LOAD
> >>>>> VIEW, Polaris rewrites the view to
> >>>>>
> >>>>>
> >>>>>       -
> >>>>>
> >>>>>       SELECT * FROM ns1.layer1_table WHERE TRUE;
> >>>>>
> >>>>>
> >>>>> For everyone else the view becomes
> >>>>>
> >>>>>       -
> >>>>>
> >>>>>       SELECT * FROM ns1.layer1_table WHERE FALSE;
> >>>>>
> >>>>>
> >>>>> The result is better and consistent control of the identity
> resolution
> >>>>> without relying on the engine side changes and giving polaris more
> >>>>> authority in enforcing things like FGAC (WIP by me).
> >>>>> Note the same can be extrapolated to any Polaris stored entity.
> >>>>>
> >>>>> *Proof of concept*
> >>>>>
> >>>>> I’ve put together a quick POC branch:
> >>>>>
> >>
> https://github.com/apache/polaris/compare/main...singhpk234:polaris:dyanmic/view
> >>>>> *Prior art*
> >>>>>
> >>>>> Snowflake context functions :
> >>>>>     https://docs.snowflake.com/en/sql-reference/functions-context
> >>>>> <https://docs.snowflake.com/en/sql-reference/functions-context>
> >>>>> Databricks Unity Catalog offers a similar mechanism called *dynamic
> >>>> views*:
> >>>>> https://docs.databricks.com/aws/en/views/dynamic
> >>>>>
> >>>>> *Next steps*
> >>>>>
> >>>>> If the community is interested, we can discuss API surface, engine
> >>>>> implications, and a roadmap for merging.
> >>>>>
> >>>>> Eager to hear your feedback!
> >>>>>
> >>>>> Best,
> >>>>> Prashant Singh
> >>>>>
> >>>> --
> >>>> Robert Stupp
> >>>> @snazy
> >>>>
> >>>>
> >> --
> >> Robert Stupp
> >> @snazy
> >>
> >>
> --
> Robert Stupp
> @snazy
>
>

Reply via email to