Hi Prashant,

It seems to me that what you are trying to do is store a scalar UDF in the Polaris catalog. Iceberg, not Polaris, is the best place to standardize this and store it as an Iceberg UDF.
Regarding SQL syntax interoperability, we still haven't implemented this in Iceberg. Since a UDF can be written in Python, that solves the majority of the interop issues, as many engines such as Snowflake, Databricks, and Trino can use Python UDFs.
There is a proposal at Iceberg on the same: https://lists.apache.org/thread/rcnpvclbq9658s0lt8wbrv3ob261y9cx
Progress is a little slow due to lack of interest from the community. We can get back on track on this one.

- Ajantha

On Tue, May 20, 2025 at 2:41 AM Prashant Singh <prashant.si...@snowflake.com.invalid> wrote:

> Hey JB,
>
> Thank you so much for the feedback. I would like to convince you by explaining my thought process behind this proposal:
>
> > not do query engine work, but more interact with any query engines for ex: TMS
>
> I agree with this in principle, and we should especially not involve any compute (for example, getting the orphan files, deleting them, etc.) in the same JVM as the catalog.
>
> But the intent of this proposal complements that: we are trying to avoid engines doing what the catalog should be doing instead, i.e. resolving identity, authN, and authZ. If we look at this as a way to authorize, then this is something we can let the catalog do, and make the engine just do query execution and not identity resolution.
>
> > not be opinionated on SQL dialect
>
> Definitely, we don't want to be opinionated based on SQL dialect as much as possible, but IMHO, at least for a case like this, where we want the catalog to resolve identity, we can accommodate it considering the value it brings: it gives the catalog more authority over identity resolution, which is a very big problem, and if we scatter this around the engines we might lose that control as a catalog. For example, consider is_principal_role('ANALYST'): if I leave this to the engine, what SQL will they evaluate? They need to come to the catalog anyway, asking "hey, is my authenticated principal_role amongst ANALYST?"
> There is no guarantee they will, and even if we do, such a contract, where we want a catalog to evaluate a function, can take ages to get due to the scattered nature.
>
> > I agree that Polaris would need to do "enforcement" but supporting any query engines/SQL dialect is very difficult
>
> Definitely agree. Hence, even if this means restricting this to something very narrow, such as requiring a `WHERE .....( clause)`, I am fine with that; I just want to build in as much enforcement as possible.
>
> > I think we should explore "abstraction" like Substrait or Coral to be agnostic
>
> Definitely, but I think since the view itself doesn't have an IR, this is not something that should be easily achievable, but I totally see where you are coming from. I think even more fundamental is who owns the SQL-to-IR conversion and, beyond that, whether all engines can read directly from the IR.
>
> I agree that we need a clear boundary between engine and catalog, and this is where I am coming from as well. AuthZ just can't be engine-only when things like identity are involved; we need to do this at the catalog level to have uniform enforcement.
>
> Please let me know your further thoughts.
>
> Best,
> Prashant Singh
>
> On Mon, May 19, 2025 at 12:21 PM Jean-Baptiste Onofré <j...@nanthrax.net> wrote:
>
> > Hi Prashant
> >
> > Thanks for the proposal.
> >
> > I understand the purpose (about FGAC which is something we plan to work on), but I'm not sure if it's a good approach with this kind of SQL functions.
> > Polaris, as a catalog, should:
> > 1. not do query engine work, but more interact with any query engines (same discussion we had about TMS)
> > 2. not be opinionated on SQL dialect
> >
> > I agree that Polaris would need to do "enforcement" but supporting any query engines/SQL dialect is very difficult. I think we should explore "abstraction" like Substrait or Coral to be agnostic.
> > I think Polaris should "integrate" query engines, with a clear boundary between what's query engine and catalog responsibility.
> >
> > I think the proposal has great value, but I'm not yet convinced by the impl approach.
> >
> > Regards
> > JB
> >
> > On Mon, May 19, 2025 at 7:26 PM Prashant Singh <prashant.si...@snowflake.com.invalid> wrote:
> > >
> > > Hi everyone,
> > >
> > > I’d like to propose adding *context-aware functions* to Apache Polaris so that view definitions can resolve security context on the Polaris side (aka the catalog end, without depending on engines).
> > >
> > > *Proposed functions*
> > >
> > > 1. *is_principal('<principal_name>')* – returns TRUE if the authenticated principal matches <principal_name>, otherwise FALSE.
> > > 2. *is_principal_role('<principal_role_name>')* – returns TRUE when <principal_role_name> appears in the principal’s role set.
> > > 3. *is_catalog_role('<catalog_role_name>')* – analogous check at the catalog-role level.
> > >
> > > *Why it matters*
> > >
> > > These predicates make views dynamic. Example:
> > >
> > > CREATE VIEW dynamic_vw AS
> > > SELECT *
> > > FROM ns1.layer1_table
> > > WHERE is_principal_role('ANALYST');
> > >
> > > When a user whose principal roles include *ANALYST* calls LOAD VIEW, Polaris rewrites the view to
> > >
> > > SELECT * FROM ns1.layer1_table WHERE TRUE;
> > >
> > > For everyone else the view becomes
> > >
> > > SELECT * FROM ns1.layer1_table WHERE FALSE;
> > >
> > > The result is better and more consistent control of identity resolution, without relying on engine-side changes, and it gives Polaris more authority in enforcing things like FGAC (WIP by me).
> > > Note the same can be extrapolated to any Polaris stored entity.
> > >
> > > *Proof of concept*
> > >
> > > I’ve put together a quick POC branch:
> > >
> > > https://github.com/apache/polaris/compare/main...singhpk234:polaris:dyanmic/view
> > >
> > > *Prior art*
> > >
> > > Snowflake context functions: https://docs.snowflake.com/en/sql-reference/functions-context
> > > Databricks Unity Catalog offers a similar mechanism called *dynamic views*: https://docs.databricks.com/aws/en/views/dynamic
> > >
> > > *Next steps*
> > >
> > > If the community is interested, we can discuss API surface, engine implications, and a roadmap for merging.
> > >
> > > Eager to hear your feedback!
> > >
> > > Best,
> > > Prashant Singh
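
For readers following the thread, the LOAD VIEW rewrite described in the original proposal can be sketched roughly as below. This is only an illustrative Python sketch, not the code in the POC branch: `SecurityContext`, `rewrite_view_sql`, and the regex-based matching are hypothetical names and choices made up for this example. A real implementation would more likely work on a parsed SQL tree, since a plain regex would also rewrite matching text inside string literals or comments.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SecurityContext:
    """Hypothetical holder for the identity the catalog has already resolved."""
    principal: str
    principal_roles: frozenset = frozenset()
    catalog_roles: frozenset = frozenset()


# Matches the three proposed functions when called with one single-quoted literal.
_CONTEXT_CALL = re.compile(
    r"\b(is_principal|is_principal_role|is_catalog_role)\s*\(\s*'([^']*)'\s*\)",
    re.IGNORECASE,
)


def rewrite_view_sql(sql: str, ctx: SecurityContext) -> str:
    """Replace each context-function call with the literal TRUE or FALSE."""

    def resolve(match: re.Match) -> str:
        name = match.group(1).lower()
        arg = match.group(2)
        if name == "is_principal":
            ok = arg == ctx.principal
        elif name == "is_principal_role":
            ok = arg in ctx.principal_roles
        else:  # is_catalog_role
            ok = arg in ctx.catalog_roles
        return "TRUE" if ok else "FALSE"

    return _CONTEXT_CALL.sub(resolve, sql)
```

With a context whose principal roles include ANALYST, `rewrite_view_sql("SELECT * FROM ns1.layer1_table WHERE is_principal_role('ANALYST')", ctx)` yields the `WHERE TRUE` form from the proposal; for any other caller it yields `WHERE FALSE`. The engine only ever sees the resolved literal, which is the point of doing the resolution on the catalog side.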