Hi Aljoscha,

Thanks for the summary and these are great questions to be answered. The
answer to your first question is clear: there is a general agreement to
override built-in functions with temp functions.

However, your second and third questions are sort of related, as a function
reference can be either just function name (like "func") or in the form or
"cat.db.func". When a reference is just function name, it can mean either a
built-in function or a function defined in the current cat/db. If we
support overriding a built-in function with a temp function, such
overriding can also cover a function in the current cat/db.

I think what Timo referred as "overriding a catalog function" means a temp
function defined as "cat.db.func" overrides a catalog function "func" in
cat/db even if cat/db is not current. To support this, temp function has to
be tied to a cat/db. What's why I said above that the 2nd and 3rd questions
are related. The problem with such support is the ambiguity when user
defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func ...".
Here "func" can means a global temp function, or a temp function in current
cat/db. If we can assume the former, this creates an inconsistency because
"CREATE FUNCTION func" actually means a function in current cat/db. If we
assume the latter, then there is no way for user to create a global temp
function.

Giving a special namespace for built-in functions may solve the ambiguity
problem above, but it also introduces artificial catalog/database that
needs special treatment and pollutes the cleanness of  the code. I would
rather introduce a syntax in DDL to solve the problem, like "CREATE
[GLOBAL] TEMPORARY FUNCTION func".

Thus, I'd like to summarize a few candidate proposals for voting purposes:

1. Support only global, temporary functions without namespace. Such temp
functions overrides built-in functions and catalog functions in current
cat/db. The resolution order is: temp functions -> built-in functions ->
catalog functions. (Partially or fully qualified functions has no
ambiguity!)

2. In addition to #1, support creating and referencing temporary functions
associated with a cat/db with "GLOBAL" qualifier in DDL for global temp
functions. The resolution order is: global temp functions -> built-in
functions -> temp functions in current cat/db -> catalog function.
(Resolution for partially or fully qualified function reference is: temp
functions -> persistent functions.)

3. In addition to #1, support creating and referencing temporary functions
associated with a cat/db with a special namespace for built-in functions
and global temp functions. The resolution is the same as #2, except that
the special namespace might be prefixed to a reference to a built-in
function or global temp function. (In absence of the special namespace, the
resolution order is the same as in #2.)

My personal preference is #1, given the unknown use case and introduced
complexity for #2 and #3. However, #2 is an acceptable alternative. Thus,
my votes are:

+1 for #1
+0 for #2
-1 for #3

Everyone, please cast your vote (in above format please!), or let me know
if you have more questions or other candidates.

Thanks,
Xuefu







On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek <aljos...@apache.org>
wrote:

> Hi,
>
> I think this discussion and the one for FLIP-64 are very connected. To
> resolve the differences, think we have to think about the basic principles
> and find consensus there. The basic questions I see are:
>
>  - Do we want to support overriding builtin functions?
>  - Do we want to support overriding catalog functions?
>  - And then later: should temporary functions be tied to a
> catalog/database?
>
> I don’t have much to say about these, except that we should somewhat stick
> to what the industry does. But I also understand that the industry is
> already very divided on this.
>
> Best,
> Aljoscha
>
> > On 18. Sep 2019, at 11:41, Jark Wu <imj...@gmail.com> wrote:
> >
> > Hi,
> >
> > +1 to strive for reaching consensus on the remaining topics. We are
> close to the truth. It will waste a lot of time if we resume the topic some
> time later.
> >
> > +1 to “1-part/override” and I’m also fine with Timo’s “cat.db.fun” way
> to override a catalog function.
> >
> > I’m not sure about “system.system.fun”, it introduces a nonexistent cat
> & db? And we still need to do special treatment for the dedicated
> system.system cat & db?
> >
> > Best,
> > Jark
> >
> >
> >> 在 2019年9月18日,06:54,Timo Walther <twal...@apache.org> 写道:
> >>
> >> Hi everyone,
> >>
> >> @Xuefu: I would like to avoid adding too many things incrementally.
> Users should be able to override all catalog objects consistently according
> to FLIP-64 (Support for Temporary Objects in Table module). If functions
> are treated completely different, we need more code and special cases. From
> an implementation perspective, this topic only affects the lookup logic
> which is rather low implementation effort which is why I would like to
> clarify the remaining items. As you said, we have a slight consenus on
> overriding built-in functions; we should also strive for reaching consensus
> on the remaining topics.
> >>
> >> @Dawid: I like your idea as it ensures registering catalog objects
> consistent and the overriding of built-in functions more explicit.
> >>
> >> Thanks,
> >> Timo
> >>
> >>
> >> On 17.09.19 11:59, kai wang wrote:
> >>> hi, everyone
> >>> I think this flip is very meaningful. it supports functions that can be
> >>> shared by different catalogs and dbs, reducing the duplication of
> functions.
> >>>
> >>> Our group based on flink's sql parser module implements create function
> >>> feature, stores the parsed function metadata and schema into mysql, and
> >>> also customizes the catalog, customizes sql-client to support custom
> >>> schemas and functions. Loaded, but the function is currently global,
> and is
> >>> not subdivided according to catalog and db.
> >>>
> >>> In addition, I very much hope to participate in the development of this
> >>> flip, I have been paying attention to the community, but found it is
> more
> >>> difficult to join.
> >>> thank you.
> >>>
> >>> Xuefu Z <usxu...@gmail.com> 于2019年9月17日周二 上午11:19写道:
> >>>
> >>>> Thanks to Tmo and Dawid for sharing thoughts.
> >>>>
> >>>> It seems to me that there is a general consensus on having temp
> functions
> >>>> that have no namespaces and overwrite built-in functions. (As a side
> note
> >>>> for comparability, the current user defined functions are all
> temporary and
> >>>> having no namespaces.)
> >>>>
> >>>> Nevertheless, I can also see the merit of having namespaced temp
> functions
> >>>> that can overwrite functions defined in a specific cat/db. However,
> this
> >>>> idea appears orthogonal to the former and can be added incrementally.
> >>>>
> >>>> How about we first implement non-namespaced temp functions now and
> leave
> >>>> the door open for namespaced ones for later releases as the
> requirement
> >>>> might become more crystal? This also helps shorten the debate and
> allow us
> >>>> to make some progress along this direction.
> >>>>
> >>>> As to Dawid's idea of having a dedicated cat/db to host the temporary
> temp
> >>>> functions that don't have namespaces, my only concern is the special
> >>>> treatment for a cat/db, which makes code less clean, as evident in
> treating
> >>>> the built-in catalog currently.
> >>>>
> >>>> Thanks,
> >>>> Xuefiu
> >>>>
> >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz <
> >>>> wysakowicz.da...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Hi,
> >>>>> Another idea to consider on top of Timo's suggestion. How about we
> have a
> >>>>> special namespace (catalog + database) for built-in objects? This
> catalog
> >>>>> would be invisible for users as Xuefu was suggesting.
> >>>>>
> >>>>> Then users could still override built-in functions, if they fully
> qualify
> >>>>> object with the built-in namespace, but by default the common logic
> of
> >>>>> current dB & cat would be used.
> >>>>>
> >>>>> CREATE TEMPORARY FUNCTION func ...
> >>>>> registers temporary function in current cat & dB
> >>>>>
> >>>>> CREATE TEMPORARY FUNCTION cat.db.func ...
> >>>>> registers temporary function in cat db
> >>>>>
> >>>>> CREATE TEMPORARY FUNCTION system.system.func ...
> >>>>> Overrides built-in function with temporary function
> >>>>>
> >>>>> The built-in/system namespace would not be writable for permanent
> >>>> objects.
> >>>>> WDYT?
> >>>>>
> >>>>> This way I think we can have benefits of both solutions.
> >>>>>
> >>>>> Best,
> >>>>> Dawid
> >>>>>
> >>>>>
> >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <twal...@apache.org> wrote:
> >>>>>
> >>>>>> Hi Bowen,
> >>>>>>
> >>>>>> I understand the potential benefit of overriding certain built-in
> >>>>>> functions. I'm open to such a feature if many people agree.
> However, it
> >>>>>> would be great to still support overriding catalog functions with
> >>>>>> temporary functions in order to prototype a query even though a
> >>>>>> catalog/database might not be available currently or should not be
> >>>>>> modified yet. How about we support both cases?
> >>>>>>
> >>>>>> CREATE TEMPORARY FUNCTION abs
> >>>>>> -> creates/overrides a built-in function and never consideres
> current
> >>>>>> catalog and database; inconsistent with other DDL but acceptable for
> >>>>>> functions I guess.
> >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun
> >>>>>> -> creates/overrides a catalog function
> >>>>>>
> >>>>>> Regarding "Flink don't have any other built-in objects (tables,
> views)
> >>>>>> except functions", this might change in the near future. Take
> >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an example.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Timo
> >>>>>>
> >>>>>> On 14.09.19 01:40, Bowen Li wrote:
> >>>>>>> Hi Fabian,
> >>>>>>>
> >>>>>>> Yes, I agree 1-part/no-override is the least favorable thus I
> didn't
> >>>>>>> include that as a voting option, and the discussion is mainly
> between
> >>>>>>> 1-part/override builtin and 3-part/not override builtin.
> >>>>>>>
> >>>>>>> Re > However, it means that temp functions are differently treated
> >>>> than
> >>>>>>> other db objects.
> >>>>>>> IMO, the treatment difference results from the fact that functions
> >>>> are
> >>>>> a
> >>>>>>> bit different from other objects - Flink don't have any other
> >>>> built-in
> >>>>>>> objects (tables, views) except functions.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Bowen
> >>>>>>>
> >>>>>>
> >>>>
> >>>> --
> >>>> Xuefu Zhang
> >>>>
> >>>> "In Honey We Trust!"
> >>>>
> >>
> >
>
>

-- 
Xuefu Zhang

"In Honey We Trust!"

Reply via email to