As I mentioned in my previous reply:
"FGAC is a very complex topic. The right way would be to have a holistic design and agreed-on approach. That does take time."

The proposed approach changes the observed behavior. It makes it impossible to change the view later on. Plus the other concerns I mentioned.

I really want to get FGAC into Polaris - but well thought through and interoperable - considering all use cases and requirements.

Let's work together on a proper design, but please let's not start with partial implementations.

On 20.05.25 18:27, Prashant Singh wrote:
Hey Robert,
I believe you are quoting the Iceberg view spec:
https://iceberg.apache.org/view-spec/#versions
  1. All representations for a version should express the same underlying
definition (this holds true).
  2. Immutable view versions: if the concern is that we are using the same
view version, we can always generate a Polaris-generated view version and
include these representations; this is an implementation detail.
Note: the spec doesn't say who can generate the view representation, as
you can very well do it with the Java API, so IMHO we are not in violation
of the spec if we create a new view version.
"the approach is blindly changing some string without any knowledge about
the actual meaning."
I think I clearly called out in my PR that it's a *POC*, if that's what is
being quoted as the end solution. I am happy to work out the rough edges,
though I think that if the return type is strictly defined as boolean, the
view definer can be held accountable if this string match leads to a broken
user experience. I would request that we objectively evaluate this idea of
resolving the identities on the Polaris side; that's all I'm really asking
for, as it's a very unorganized world when it comes to unifying Spark's
current_user() with is_principal() in Polaris.

I hope this answers your concern. I am totally open to any recommendation
and to working out the rough edges. Let's solve this problem together as a
community!

Best,
Prashant Singh

On Tue, May 20, 2025 at 9:06 AM Robert Stupp <sn...@snazy.de> wrote:

This proposal _does_ change the view definition - it returns a
_different_ representation than the one that has been stored before.
This is a change that breaks the contract of the specification and it
changes the observed behavior.

FGAC is a very complex topic. The right way would be to have a holistic
design and agreed-on approach. That does take time.

On 20.05.25 16:47, Eric Maynard wrote:
I wouldn’t say that Polaris is changing a view definition, but per my
understanding Polaris is actually generating a view based on a Policy.

We will need a way for Polaris to embed some information into these views.
I don’t think this is a P0 to make FGAC work, and I don’t think this
necessarily needs to take the form of a SQL function. For example, it could
be through a policy like:

{
    "allow_columns": ["a", "b"],
    "transform_columns": [
      {
        "col": "c",
        "predicate": "some_func(x)"
      },
      {
        "col": "d",
        "predicate": "${current_user} == admin"
      }
    ]
}

Perhaps we can try to get the first iteration of FGAC (with only field
“allow_columns”) out first. Then, we can implement the engine-driven
predicates (like that on column “c”). Finally, we can examine options for
catalog-driven predicates like the one on column “d”.
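
As a rough illustration of that first iteration, the sketch below applies
only the "allow_columns" field of such a policy to a list of column names.
The function name project_allowed and the reading of "allow_columns" as
"keep only these columns" are assumptions made for this example, not an
agreed Polaris design.

```python
import json

# Hypothetical policy document; only "allow_columns" is evaluated here.
policy = json.loads('{"allow_columns": ["a", "b"]}')


def project_allowed(table_columns: list[str], policy: dict) -> list[str]:
    """Keep only the columns the policy allows, preserving table order."""
    allowed = set(policy.get("allow_columns", []))
    return [col for col in table_columns if col in allowed]


print(project_allowed(["a", "b", "c", "d"], policy))  # ['a', 'b']
```

The engine-driven and catalog-driven predicates are deliberately left out of
the sketch, since how they would be evaluated is still open.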

—EM

On Tue, May 20, 2025 at 9:44 AM Robert Stupp <sn...@snazy.de> wrote:

I don't think that Polaris should change any view definition in any way,
but this is what the proposed approach does.

The approach breaks the contract (behavior defined by the specification)
and in turn the observed behavior; both are absolute no-gos.

Practically speaking, the approach is blindly changing some string
without any knowledge about the actual meaning. But it has to know
exactly what it's doing - and to do that it has to know all the SQL
dialects.
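
To make the concern concrete, here is a minimal sketch, not the POC code:
naive_rewrite is a hypothetical helper and the view text is invented for
illustration. It shows a plain text replacement also rewriting a quoted
column alias that merely looks like the function call.

```python
# Hypothetical illustration of a plain text replacement; not Polaris code.
def naive_rewrite(view_sql: str, principal: str, is_match: bool) -> str:
    """Replace every textual occurrence of is_principal('<principal>')."""
    call = f"is_principal('{principal}')"
    return view_sql.replace(call, "TRUE" if is_match else "FALSE")


# The double-quoted alias is an identifier, not a function call.
sql = "SELECT flag AS \"is_principal('alice')\" FROM t WHERE is_principal('alice')"

print(naive_rewrite(sql, "alice", True))
# SELECT flag AS "TRUE" FROM t WHERE TRUE
```

Telling the alias apart from the predicate requires knowing how the dialect
quotes identifiers, strings, and comments.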

As a side note: every query engine already has information about the
user. I'm definitely not supporting exposing any authZ related
information.

FGAC as a feature is a great thing to have. But the proposed approach is
not the right way.


On 19.05.25 20:33, Prashant Singh wrote:
Hey Robert,

Thank you for your honest feedback. Please let me try to answer your
concerns:

"There are tons of SQL dialects out there, each requires its own fully
implemented lexer/parser/interpreter"
That's true, and we are not interpreting it either. We are just replacing
the SQL text wherever there is `is_principal('<principal_name>')` with the
value TRUE or FALSE on the server side.
We are not re-interpreting or parsing the tree. I am assuming this is what
the Analyzer already does in constant folding and boolean simplification,
although that happens after parsing. IMHO it's not an impossible thing to
achieve. If it helps, I am even fine with wrapping this as
`{{is_principal('<principal_name>')}}` to make it very specific and let
only Polaris act on it. IMHO we can work out the rough edges of the
replacement.

"for view containing view"
This should not be a problem, as we just replace the text of the current
view definition. When it comes to resolving the nested view, the same LOAD
VIEW call is issued, but with the nested view identifier, and that's when
we will do the replacement for the nested view.
We don't open the nested view in the definition during the loadView of the
parent, if that's the concern here. The nested view is treated like any
other identifier, which is opened / interpreted at a later stage of
execution.

"Exposing authZ information via any kind of publicly accessible API to
every user sounds like an interesting source of information - especially
for the 'not so good and nice guys'."

Yes, that's true, and that's my intention; it's just a matter of how we
deliver the info, i.e. I expose it through the view definition itself (or
through any other entity stored in Polaris). Exposing this as an API would
require engine-side integration too, over which we as a catalog have very
little control.

"What's the benefit over having the ACLs on the table/view defined in the
intended way?"

It's more from a feature-parity perspective, and about giving more control
over the view than ACLs alone (which are conjunctions). For example, we
could make the view definition more complex with additional predicates,
such as a disjunction:

select * from ns1.layer1_table where (condition1) OR
(is_principal_role('ANALYST'))

I would love to get your further feedback, considering the above.


Best,
Prashant Singh




On Mon, May 19, 2025 at 11:04 AM Robert Stupp <sn...@snazy.de> wrote:

I'm brutally honest here:

I think we should really stay away from interpreting SQL or any other
kind of (view) definition in Polaris. There are tons of SQL dialects out
there, each requires its own fully implemented lexer/parser/interpreter -
plus views-in-views-in-views-in-views... constructs requiring
resolution of nested views. It eventually ends in implementing
yet-another-query-engine. I doubt that this is doable with a
"java.lang.String.replace(from, to)" approach.

Exposing authZ information via any kind of publicly accessible API to
every user sounds like an interesting source of information - especially
for the "not so good and nice guys".

Regarding the examples: what's the benefit over having the ACLs on
the table/view defined in the intended way?

On 19.05.25 19:26, Prashant Singh wrote:
Hi everyone,

I’d like to propose adding *context-aware functions* to Apache Polaris so
that view definitions can resolve security context on the Polaris side
(aka the catalog end, without depending on engines).

*Proposed functions*

  1. *is_principal('<principal_name>')* – returns TRUE if the authenticated
     principal matches <principal_name>, otherwise FALSE.
  2. *is_principal_role('<principal_role_name>')* – returns TRUE when
     <principal_role_name> appears in the principal’s role set.
  3. *is_catalog_role('<catalog_role_name>')* – analogous check at the
     catalog-role level.

*Why it matters*

These predicates make views dynamic. Example:

CREATE VIEW dynamic_vw AS
SELECT *
FROM ns1.layer1_table
WHERE is_principal_role('ANALYST');

When a user whose principal roles include *ANALYST* calls LOAD VIEW,
Polaris rewrites the view to

       SELECT * FROM ns1.layer1_table WHERE TRUE;

For everyone else the view becomes

       SELECT * FROM ns1.layer1_table WHERE FALSE;


The result is better and more consistent control of identity resolution
without relying on engine-side changes, giving Polaris more authority in
enforcing things like FGAC (WIP by me).
Note that the same can be extrapolated to any entity stored in Polaris.
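
For readers who want to see the substitution spelled out, here is a minimal
sketch of the rewrite described above. It is not the code from the POC
branch: the helper resolve_principal_roles, the regular expression, and the
role sets are assumptions made only for this example.

```python
import re

# Hypothetical helper, not the code from the POC branch: it only illustrates
# the textual substitution of is_principal_role('<name>') calls.
_ROLE_CALL = re.compile(r"is_principal_role\(\s*'([^']*)'\s*\)", re.IGNORECASE)


def resolve_principal_roles(view_sql: str, caller_roles: set[str]) -> str:
    """Replace each is_principal_role('<name>') call with TRUE or FALSE."""
    def substitute(match: re.Match) -> str:
        return "TRUE" if match.group(1) in caller_roles else "FALSE"

    return _ROLE_CALL.sub(substitute, view_sql)


stored = "SELECT * FROM ns1.layer1_table WHERE is_principal_role('ANALYST')"
print(resolve_principal_roles(stored, {"ANALYST"}))
# SELECT * FROM ns1.layer1_table WHERE TRUE
print(resolve_principal_roles(stored, {"ENGINEER"}))
# SELECT * FROM ns1.layer1_table WHERE FALSE
```

The sketch works purely on text, so it does not distinguish a real call
from the same characters inside a string literal, comment, or quoted
identifier.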

*Proof of concept*

I’ve put together a quick POC branch:

https://github.com/apache/polaris/compare/main...singhpk234:polaris:dyanmic/view

*Prior art*

Snowflake context functions:
https://docs.snowflake.com/en/sql-reference/functions-context
Databricks Unity Catalog offers a similar mechanism called *dynamic views*:
https://docs.databricks.com/aws/en/views/dynamic

*Next steps*

If the community is interested, we can discuss API surface, engine
implications, and a roadmap for merging.

Eager to hear your feedback!

Best,
Prashant Singh

--
Robert Stupp
@snazy

