Ah, I misunderstood that this applies only to views generated based on an FGAC policy. On this point I agree -- we probably don't want to change VIEW definitions, or at least we need to be very cautious about doing so. Let's not put this on the critical path for FGAC views.
--EM On Tue, May 20, 2025 at 5:07 PM Robert Stupp <sn...@snazy.de> wrote: > This proposal _does_ change the view definition - it returns a > _different_ representation than the one that has been stored before. > This is a change that breaks the contract of the specification and it > changes the observed behavior. > > FGAC is a very complex topic. The right way would be to have a holistic > design and agreed-on approach. That does take time. > > On 20.05.25 16:47, Eric Maynard wrote: > > I wouldn’t say that Polaris is changing a view definition, but per my > > understanding Polaris is actually generating a view based on a Policy. > > > > We will need a way for Polaris to embed some information into these > views. > > I don’t think this is a P0 to make FGAC work, and I don’t think this > > necessarily needs to take the form of a SQL function. For example, it > could > > be through a policy like: > > > > { > > “allow_columns”: [“a”, “b”], > > “transform_columns”: { > > { > > “col”: “c”, > > “predicate”: “some_func(x)”}, > > { > > “col”: “d”, > > “predicate: “${current_user} == admin” > > } > > } > > > > Perhaps we can try to get the first iteration of FGAC (with only field > > “allow_columns”) out first. Then, we can implement the engine-driven > > predicates (like that on column “c”). Finally, we can examine options for > > catalog-driven predicates like the one on column “d”. > > > > —EM > > > > On Tue, May 20, 2025 at 9:44 AM Robert Stupp <sn...@snazy.de> wrote: > > > >> I don't think that Polaris should change any view definition in any way, > >> but this is what the proposed approach does. > >> > >> The approach breaks the contract (behavior defined by the specification) > >> and in turn the observed behavior, absolute no go's. > >> > >> Practically speaking, the approach is blindly changing some string > >> without any knowledge about the actual meaning. But it has to know > >> exactly what it's doing - and to do that it has to know all the SQL > >> dialects. > >> > >> As a side note: every query engine already has information about the > >> user. I'm definitely not supporting exposing any authZ related > information. > >> > >> FGAC as a feature is a great thing to have. But the proposed approach is > >> not the right way. > >> > >> > >> On 19.05.25 20:33, Prashant Singh wrote: > >>> Hey Robert, > >>> > >>> Thank you for your honest feedbacks, please let me try answering your > >>> concerns : > >>> > >>>> There are tons of SQL dialects out there, each requires its own fully > >>> implemented lexer/parser/interpreter > >>> That's true and we are not interpreting it either, we are just > replacing > >>> the sql text wherever there is `is_principal('<principal_name>')` with > >> the > >>> value of TRUE and FALSE from the server end > >>> we are not re-interpreting or parsing the tree, i am assuming this is > >> what > >>> Analyzer already does in constant folding and boolean simplification, > but > >>> yes post parsing, but IMHO i don't think it's an impossible thing to > >>> achieve. If it helps i am even fine is wrapping this as > >>> `{{is_principal('<principal_name>')}}` to make this very specific and > let > >>> only Polaris work, IMHO we can work out the rough edges with the > >>> replacement. > >>> > >>>> for view containing view > >>> This should not be a problem, as we just replace the text of the > current > >>> view definition when it comes to resolve the nested view it will issue > >> the > >>> same call of LOAD view but with the nested view identifier, when it > will > >> be > >>> the call of nested view and that's when i we will do the replace > >>> we don't open the nested view in the definition during the loadView of > >> the > >>> parent, if that's the concern here, the nested view is treated > equivalent > >>> to any other identifier which is opened / interpreted at later state of > >>> execution. > >>> > >>>> Exposing authZ information via any kind of publicly accessible API to > >>> every user sounds like an interesting source of information - > especially > >>> for the "not so good and nice guys". > >>> > >>> Yes that's true and that's my intention it's just how we are delivering > >> the > >>> info, i.e i expose it by view definition itself (or by any other entity > >>> stored in Polaris) , but exposing this as an API would require engine > >> side > >>> integration too, which we as catalog have a very less control over as a > >>> catalog. > >>> > >>>> What's the benefit over having the ACLs on the table/view defined in > the > >>> intended way? > >>> > >>> It's more from feature parity perspective and giving more control on > view > >>> rather than just ACL (which are conjunctions) for ex if we just > >> complicate > >>> the view def with more predicated for ex disjunction > >>> > >>> select * from ns1.layer1_table where (condition1) OR > >>> (is_principal_role('ANALYST')) > >>> > >>> I would love to get your further feedback, considering the above. > >>> > >>> > >>> Best, > >>> Prashant Singh > >>> > >>> > >>> > >>> > >>> On Mon, May 19, 2025 at 11:04 AM Robert Stupp <sn...@snazy.de> wrote: > >>> > >>>> I'm brutally honest here: > >>>> > >>>> I think we should really stay away from interpreting SQL or any other > >>>> kind of (view) definition in Polaris. There are tons of SQL dialects > out > >>>> there, each requires its own fully implemented > lexer/parser/interpreter > >>>> - plus views-in-views-in-views-in-views... constructs requiring > >>>> resolution of nested views. It eventually ends in implementing > >>>> yet-another-query-engine. I doubt that this is doable with a > >>>> "java.lang.String.replace(from, to)" approach. > >>>> > >>>> Exposing authZ information via any kind of publicly accessible API to > >>>> every user sounds like an interesting source of information - > especially > >>>> for the "not so good and nice guys". > >>>> > >>>> Regarding the examples: what's the benefit over having the the ACLs on > >>>> the table/view defined in the intended way? > >>>> > >>>> On 19.05.25 19:26, Prashant Singh wrote: > >>>>> Hi everyone, > >>>>> > >>>>> I’d like to propose adding *context-aware functions* to Apache > Polaris > >> so > >>>>> that view definitions can resolve security context on the Polaris > side > >>>> (aka > >>>>> catalog end without depending on engines). > >>>>> > >>>>> *Proposed functions* > >>>>> > >>>>> 1. > >>>>> > >>>>> *is_principal('<principal_name>')* – returns TRUE if the > >>>> authenticated > >>>>> principal matches <principal_name>, otherwise FALSE. > >>>>> 2. > >>>>> > >>>>> *is_principal_role('<principal_role_name>')* – returns TRUE > when > >>>>> <principal_role_name> appears in the principal’s role set. > >>>>> 3. > >>>>> > >>>>> *is_catalog_role('<catalog_role_name>')* – analogous check at > the > >>>>> catalog-role level. > >>>>> > >>>>> *Why it matters* > >>>>> > >>>>> These predicates make views dynamic. Example: > >>>>> > >>>>> CREATE VIEW dynamic_vw ASSELECT *FROM ns1.layer1_tableWHERE > >>>>> is_principal_role('ANALYST'); > >>>>> > >>>>> When a user whose one of principal roles include *ANALYST* calls LOAD > >>>>> VIEW, Polaris rewrites the view to > >>>>> > >>>>> > >>>>> - > >>>>> > >>>>> SELECT * FROM ns1.layer1_table WHERE TRUE; > >>>>> > >>>>> > >>>>> For everyone else the view becomes > >>>>> > >>>>> - > >>>>> > >>>>> SELECT * FROM ns1.layer1_table WHERE FALSE; > >>>>> > >>>>> > >>>>> The result is better and consistent control of the identity > resolution > >>>>> without relying on the engine side changes and giving polaris more > >>>>> authority in enforcing things like FGAC (WIP by me). > >>>>> Note the same can be extrapolated to any Polaris stored entity. > >>>>> > >>>>> *Proof of concept* > >>>>> > >>>>> I’ve put together a quick POC branch: > >>>>> > >> > https://github.com/apache/polaris/compare/main...singhpk234:polaris:dyanmic/view > >>>>> *Prior art* > >>>>> > >>>>> Snowflake context functions : > >>>>> https://docs.snowflake.com/en/sql-reference/functions-context > >>>>> <https://docs.snowflake.com/en/sql-reference/functions-context> > >>>>> Databricks Unity Catalog offers a similar mechanism called *dynamic > >>>> views*: > >>>>> https://docs.databricks.com/aws/en/views/dynamic > >>>>> > >>>>> *Next steps* > >>>>> > >>>>> If the community is interested, we can discuss API surface, engine > >>>>> implications, and a roadmap for merging. > >>>>> > >>>>> Eager to hear your feedback! > >>>>> > >>>>> Best, > >>>>> Prashant Singh > >>>>> > >>>> -- > >>>> Robert Stupp > >>>> @snazy > >>>> > >>>> > >> -- > >> Robert Stupp > >> @snazy > >> > >> > -- > Robert Stupp > @snazy > >