I wouldn’t say that Polaris is changing a view definition, but per my understanding Polaris is actually generating a view based on a Policy.
We will need a way for Polaris to embed some information into these views. I don’t think this is a P0 to make FGAC work, and I don’t think this necessarily needs to take the form of a SQL function. For example, it could be through a policy like: { “allow_columns”: [“a”, “b”], “transform_columns”: { { “col”: “c”, “predicate”: “some_func(x)”}, { “col”: “d”, “predicate: “${current_user} == admin” } } Perhaps we can try to get the first iteration of FGAC (with only field “allow_columns”) out first. Then, we can implement the engine-driven predicates (like that on column “c”). Finally, we can examine options for catalog-driven predicates like the one on column “d”. —EM On Tue, May 20, 2025 at 9:44 AM Robert Stupp <sn...@snazy.de> wrote: > I don't think that Polaris should change any view definition in any way, > but this is what the proposed approach does. > > The approach breaks the contract (behavior defined by the specification) > and in turn the observed behavior, absolute no go's. > > Practically speaking, the approach is blindly changing some string > without any knowledge about the actual meaning. But it has to know > exactly what it's doing - and to do that it has to know all the SQL > dialects. > > As a side note: every query engine already has information about the > user. I'm definitely not supporting exposing any authZ related information. > > FGAC as a feature is a great thing to have. But the proposed approach is > not the right way. > > > On 19.05.25 20:33, Prashant Singh wrote: > > Hey Robert, > > > > Thank you for your honest feedbacks, please let me try answering your > > concerns : > > > >> There are tons of SQL dialects out there, each requires its own fully > > implemented lexer/parser/interpreter > > That's true and we are not interpreting it either, we are just replacing > > the sql text wherever there is `is_principal('<principal_name>')` with > the > > value of TRUE and FALSE from the server end > > we are not re-interpreting or parsing the tree, i am assuming this is > what > > Analyzer already does in constant folding and boolean simplification, but > > yes post parsing, but IMHO i don't think it's an impossible thing to > > achieve. If it helps i am even fine is wrapping this as > > `{{is_principal('<principal_name>')}}` to make this very specific and let > > only Polaris work, IMHO we can work out the rough edges with the > > replacement. > > > >> for view containing view > > This should not be a problem, as we just replace the text of the current > > view definition when it comes to resolve the nested view it will issue > the > > same call of LOAD view but with the nested view identifier, when it will > be > > the call of nested view and that's when i we will do the replace > > we don't open the nested view in the definition during the loadView of > the > > parent, if that's the concern here, the nested view is treated equivalent > > to any other identifier which is opened / interpreted at later state of > > execution. > > > >> Exposing authZ information via any kind of publicly accessible API to > > every user sounds like an interesting source of information - especially > > for the "not so good and nice guys". > > > > Yes that's true and that's my intention it's just how we are delivering > the > > info, i.e i expose it by view definition itself (or by any other entity > > stored in Polaris) , but exposing this as an API would require engine > side > > integration too, which we as catalog have a very less control over as a > > catalog. > > > >> What's the benefit over having the ACLs on the table/view defined in the > > intended way? > > > > It's more from feature parity perspective and giving more control on view > > rather than just ACL (which are conjunctions) for ex if we just > complicate > > the view def with more predicated for ex disjunction > > > > select * from ns1.layer1_table where (condition1) OR > > (is_principal_role('ANALYST')) > > > > I would love to get your further feedback, considering the above. > > > > > > Best, > > Prashant Singh > > > > > > > > > > On Mon, May 19, 2025 at 11:04 AM Robert Stupp <sn...@snazy.de> wrote: > > > >> I'm brutally honest here: > >> > >> I think we should really stay away from interpreting SQL or any other > >> kind of (view) definition in Polaris. There are tons of SQL dialects out > >> there, each requires its own fully implemented lexer/parser/interpreter > >> - plus views-in-views-in-views-in-views... constructs requiring > >> resolution of nested views. It eventually ends in implementing > >> yet-another-query-engine. I doubt that this is doable with a > >> "java.lang.String.replace(from, to)" approach. > >> > >> Exposing authZ information via any kind of publicly accessible API to > >> every user sounds like an interesting source of information - especially > >> for the "not so good and nice guys". > >> > >> Regarding the examples: what's the benefit over having the the ACLs on > >> the table/view defined in the intended way? > >> > >> On 19.05.25 19:26, Prashant Singh wrote: > >>> Hi everyone, > >>> > >>> I’d like to propose adding *context-aware functions* to Apache Polaris > so > >>> that view definitions can resolve security context on the Polaris side > >> (aka > >>> catalog end without depending on engines). > >>> > >>> *Proposed functions* > >>> > >>> 1. > >>> > >>> *is_principal('<principal_name>')* – returns TRUE if the > >> authenticated > >>> principal matches <principal_name>, otherwise FALSE. > >>> 2. > >>> > >>> *is_principal_role('<principal_role_name>')* – returns TRUE when > >>> <principal_role_name> appears in the principal’s role set. > >>> 3. > >>> > >>> *is_catalog_role('<catalog_role_name>')* – analogous check at the > >>> catalog-role level. > >>> > >>> *Why it matters* > >>> > >>> These predicates make views dynamic. Example: > >>> > >>> CREATE VIEW dynamic_vw ASSELECT *FROM ns1.layer1_tableWHERE > >>> is_principal_role('ANALYST'); > >>> > >>> When a user whose one of principal roles include *ANALYST* calls LOAD > >>> VIEW, Polaris rewrites the view to > >>> > >>> > >>> - > >>> > >>> SELECT * FROM ns1.layer1_table WHERE TRUE; > >>> > >>> > >>> For everyone else the view becomes > >>> > >>> - > >>> > >>> SELECT * FROM ns1.layer1_table WHERE FALSE; > >>> > >>> > >>> The result is better and consistent control of the identity resolution > >>> without relying on the engine side changes and giving polaris more > >>> authority in enforcing things like FGAC (WIP by me). > >>> Note the same can be extrapolated to any Polaris stored entity. > >>> > >>> *Proof of concept* > >>> > >>> I’ve put together a quick POC branch: > >>> > >> > https://github.com/apache/polaris/compare/main...singhpk234:polaris:dyanmic/view > >>> *Prior art* > >>> > >>> Snowflake context functions : > >>> https://docs.snowflake.com/en/sql-reference/functions-context > >>> <https://docs.snowflake.com/en/sql-reference/functions-context> > >>> Databricks Unity Catalog offers a similar mechanism called *dynamic > >> views*: > >>> https://docs.databricks.com/aws/en/views/dynamic > >>> > >>> *Next steps* > >>> > >>> If the community is interested, we can discuss API surface, engine > >>> implications, and a roadmap for merging. > >>> > >>> Eager to hear your feedback! > >>> > >>> Best, > >>> Prashant Singh > >>> > >> -- > >> Robert Stupp > >> @snazy > >> > >> > -- > Robert Stupp > @snazy > >