Re: Context-Aware Functions for Apache Polaris

Alex Dutra Tue, 17 Jun 2025 05:57:14 -0700

Hi all,

Catching up quite late with this discussion thread, my apologies.
Trying to sum up my impressions so far:

We should definitely avoid parsing SQL and altering view definitions.
Instead it seems that Eric's proposal of defining FGAC policies with a
clear syntax and spec is a much more robust approach.

We should focus initially on the three main purposes of FGAC
policies: row filtering, column filtering, and column masking.

Prashant's initial proposal seems to suggest that some function calls
in FGAC policies could be constant-folded server-side. E.g. the
expression "IF has_role('admin')" in an FGAC policy body could be
inlined to TRUE or FALSE by the server, depending on the current
user. This is probably what Eric meant by "catalog-driven predicates".
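
To make the folding idea concrete, here is a minimal sketch. It
assumes a hypothetical expression tree (Call, And) and a hypothetical
has_role built-in; none of these names come from Polaris or Iceberg.
Calls the server can resolve are replaced by booleans, and everything
else is left in place for the engine:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """Hypothetical node: a function call such as has_role('admin')."""
    name: str
    args: tuple


@dataclass(frozen=True)
class And:
    """Hypothetical node: logical AND of two sub-expressions."""
    left: object
    right: object


def fold(expr, roles):
    """Fold has_role(...) calls using the current principal's roles.

    Returns True/False when the expression is fully resolved on the
    server, otherwise a residual expression for the engine to evaluate.
    """
    if isinstance(expr, Call):
        if expr.name == "has_role":
            return expr.args[0] in roles
        return expr  # not foldable here: left for the engine
    if isinstance(expr, And):
        left = fold(expr.left, roles)
        right = fold(expr.right, roles)
        if left is False or right is False:
            return False
        if left is True:
            return right
        if right is True:
            return left
        return And(left, right)
    return expr


# region_match is an invented, engine-side predicate for illustration.
policy = And(Call("has_role", ("admin",)), Call("region_match", ("EU",)))
residual_for_admin = fold(policy, {"admin"})
residual_for_analyst = fold(policy, {"analyst"})
```

In this sketch an admin ends up with only the region_match call left
for the engine, while any other principal gets a constant False.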

But not all function calls will be constant-foldable, and some of them
will have to be interpreted on the engine side (or, as Eric puts it,
"engine-driven predicates"). Then the question is: how would an engine
be able to incorporate a function/predicate like "has_role('admin')"
into the query plan?

That is imho a very complex topic. It probably requires the FGAC
specification to settle on a set of "standard" or "built-in" functions
that all REST catalogs must support. But such functions cannot be
pure, as the function result also depends on the context, not only on
the function arguments (IOW: has_role('admin') can return TRUE or
FALSE). Thus, I'm not sure we can model these functions as (scalar)
UDFs, since for now the UDF specification defines scalar UDFs as
"based on predefined logic". We might need a different concept here.

That said, we could maybe start working on FGAC with support for just
a few common, constant-foldable expressions in FGAC policies, and go
from there.

My 2c,

Alex

On Sat, Jun 7, 2025 at 12:34 AM Prashant Singh
<prashant.si...@snowflake.com.invalid> wrote:
>
> In terms of achieving this, I was thinking of the following order as well:
> 1. Role-based column filtering. This should be achievable without a lot
> of heavy lifting.
> 2. Simple row-based filters.
> 3. Column masks and row masks (once Iceberg UDFs are established).
> I agree that we just need to store the mask name, and we can send it back
> to the engine in many ways. But I think that for a mask to mean something
> for all engines, we need Iceberg UDFs: for example, email_mask must mean
> the same XYZ across Engine A and Engine B. How we achieve this, whether
> through an IR or by storing the function in each individual dialect, still
> needs some brainstorming, but in principle having the engine come back to
> the catalog for the function definition would imho be the best bet! So
> tackling row masks and column masks once UDFs are first-class citizens
> might be best, though starting with 1 seems the simplest!
> I am working on a proposal doc incorporating these thoughts for Polaris.
> I will share it with the community *soon* and would love to get all of
> your feedback!
>
> Best,
> Prashant Singh
>
>
> On Fri, Jun 6, 2025 at 2:48 PM Eric Maynard <eric.w.mayn...@gmail.com>
> wrote:
>
> > It seems to me that the *easiest* to start with would be role-based column
> > filtering. There are no functions to grapple with, no dialect differences.
> > You simply grab the list of columns that a given principal role has access
> > to according to the FGAC policy attached to a given table.
> >
> > --EM
> >
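
The role-based column filtering Eric describes in the quoted message
could look roughly like the sketch below. The policy shape (a mapping
from principal role to readable columns) and all names are assumptions
made for illustration, not an existing Polaris API:

```python
# Hypothetical policy attached to a table: principal role -> readable columns.
POLICY = {
    "analyst": {"id", "country"},
    "admin": {"id", "country", "email"},
}


def visible_columns(table_columns, principal_roles, policy):
    """Return the table's columns readable by any of the given roles.

    Table column order is preserved; roles with no policy entry grant nothing.
    """
    allowed = set()
    for role in principal_roles:
        allowed |= policy.get(role, set())
    return [col for col in table_columns if col in allowed]


columns = ["id", "email", "country"]
analyst_view = visible_columns(columns, ["analyst"], POLICY)
unknown_view = visible_columns(columns, ["guest"], POLICY)
```

No functions or dialect differences are involved: the catalog only has
to return a projected column list for the current principal.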
