Thanks, Ajantha.

I'm skeptical about whether it's a good idea to add UDFs tracked by Iceberg
catalogs. I think that Iceberg primarily deals with things that are
centralized, like tables of data. While it would be great to have a common
set of functions across engines, I don't see how that is practical when
those engines are implemented so differently. Plugging in code -- and
especially custom user-supplied code -- seems inherently specialized to me
and should be part of the engines' design.

I guess we'll know more when you post the proposal, but I think this would
be a very difficult area to tackle across engines, languages, and memory
models without incurring a huge performance penalty.


On Fri, May 24, 2024 at 8:10 AM Ajantha Bhat wrote:

> Hi Everyone,
> This is a discussion to gauge community interest in storing versioned
> SQL UDFs in Iceberg.
> We want to propose a spec addition for storing versioned UDFs in
> Iceberg (inspired by the view spec).
> These UDFs can operate similarly to views in that they are associated with
> tables, but they can accept arguments and produce return values, or even
> function as inline expressions.
> Many query engines, such as Dremio, Trino, Snowflake, and Databricks Spark,
> support SQL UDFs at the catalog level [1].
> But storing them in Iceberg can enable:
> - Versioning of these UDFs.
> - Interoperability between the engines. Potentially, engines can understand
> the UDFs written by other engines (with a translation layer).
> We believe that integrating this feature into Iceberg would be a valuable
> addition, and we're eager to collaborate with the community to develop a
> UDF specification.
> Stephen has already begun drafting a
> specification to propose to the community.
> Let us know your thoughts on this.
> [1]
> Dremio -
> Trino -
> Snowflake -
> Databricks -
> - Ajantha

Ryan Blue
