@Dawid, Re: we also don't need additional referencing the specialcatalog anywhere.
True. But once we allow such reference, then user can do so in any possible place where a function name is expected, for which we have to handle. That's a big difference, I think. Thanks, Xuefu On Wed, Sep 18, 2019 at 5:25 PM Dawid Wysakowicz <wysakowicz.da...@gmail.com> wrote: > @Bowen I am not suggesting introducing additional catalog. I think we need > to get rid of the current built-in catalog. > > @Xuefu in option #3 we also don't need additional referencing the special > catalog anywhere else besides in the CREATE statement. The resolution > behaviour is exactly the same in both options. > > On Thu, 19 Sep 2019, 08:17 Xuefu Z, <usxu...@gmail.com> wrote: > > > Hi Dawid, > > > > "GLOBAL" is a temporary keyword that was given to the approach. It can be > > changed to something else for better. > > > > The difference between this and the #3 approach is that we only need the > > keyword for this create DDL. For other places (such as function > > referencing), no keyword or special namespace is needed. > > > > Thanks, > > Xuefu > > > > On Wed, Sep 18, 2019 at 4:32 PM Dawid Wysakowicz < > > wysakowicz.da...@gmail.com> > > wrote: > > > > > Hi, > > > I think it makes sense to start voting at this point. > > > > > > Option 1: Only 1-part identifiers > > > PROS: > > > - allows shadowing built-in functions > > > CONS: > > > - incosistent with all the other objects, both permanent & temporary > > > - does not allow shadowing catalog functions > > > > > > Option 2: Special keyword for built-in function > > > I think this is quite similar to the special catalog/db. The thing I am > > > strongly against in this proposal is the GLOBAL keyword. This keyword > > has a > > > meaning in rdbms systems and means a function that is present for a > > > lifetime of a session in which it was created, but available in all > other > > > sessions. Therefore I really don't want to use this keyword in a > > different > > > context. > > > > > > Option 3: Special catalog/db > > > > > > PROS: > > > - allows shadowing built-in functions > > > - allows shadowing catalog functions > > > - consistent with other objects > > > CONS: > > > - we introduce a special namespace for built-in functions > > > > > > I don't see a problem with introducing the special namespace. In the > end > > it > > > is very similar to the keyword approach. In this case the catalog/db > > > combination would be the "keyword" > > > > > > Therefore my votes: > > > Option 1: -0 > > > Option 2: -1 (I might change to +0 if we can come up with a better > > keyword) > > > Option 3: +1 > > > > > > Best, > > > Dawid > > > > > > > > > On Thu, 19 Sep 2019, 05:12 Xuefu Z, <usxu...@gmail.com> wrote: > > > > > > > Hi Aljoscha, > > > > > > > > Thanks for the summary and these are great questions to be answered. > > The > > > > answer to your first question is clear: there is a general agreement > to > > > > override built-in functions with temp functions. > > > > > > > > However, your second and third questions are sort of related, as a > > > function > > > > reference can be either just function name (like "func") or in the > form > > > or > > > > "cat.db.func". When a reference is just function name, it can mean > > > either a > > > > built-in function or a function defined in the current cat/db. If we > > > > support overriding a built-in function with a temp function, such > > > > overriding can also cover a function in the current cat/db. > > > > > > > > I think what Timo referred as "overriding a catalog function" means a > > > temp > > > > function defined as "cat.db.func" overrides a catalog function "func" > > in > > > > cat/db even if cat/db is not current. To support this, temp function > > has > > > to > > > > be tied to a cat/db. What's why I said above that the 2nd and 3rd > > > questions > > > > are related. The problem with such support is the ambiguity when user > > > > defines a function w/o namespace, "CREATE TEMPORARY FUNCTION func > ...". > > > > Here "func" can means a global temp function, or a temp function in > > > current > > > > cat/db. If we can assume the former, this creates an inconsistency > > > because > > > > "CREATE FUNCTION func" actually means a function in current cat/db. > If > > we > > > > assume the latter, then there is no way for user to create a global > > temp > > > > function. > > > > > > > > Giving a special namespace for built-in functions may solve the > > ambiguity > > > > problem above, but it also introduces artificial catalog/database > that > > > > needs special treatment and pollutes the cleanness of the code. I > > would > > > > rather introduce a syntax in DDL to solve the problem, like "CREATE > > > > [GLOBAL] TEMPORARY FUNCTION func". > > > > > > > > Thus, I'd like to summarize a few candidate proposals for voting > > > purposes: > > > > > > > > 1. Support only global, temporary functions without namespace. Such > > temp > > > > functions overrides built-in functions and catalog functions in > current > > > > cat/db. The resolution order is: temp functions -> built-in functions > > -> > > > > catalog functions. (Partially or fully qualified functions has no > > > > ambiguity!) > > > > > > > > 2. In addition to #1, support creating and referencing temporary > > > functions > > > > associated with a cat/db with "GLOBAL" qualifier in DDL for global > temp > > > > functions. The resolution order is: global temp functions -> built-in > > > > functions -> temp functions in current cat/db -> catalog function. > > > > (Resolution for partially or fully qualified function reference is: > > temp > > > > functions -> persistent functions.) > > > > > > > > 3. In addition to #1, support creating and referencing temporary > > > functions > > > > associated with a cat/db with a special namespace for built-in > > functions > > > > and global temp functions. The resolution is the same as #2, except > > that > > > > the special namespace might be prefixed to a reference to a built-in > > > > function or global temp function. (In absence of the special > namespace, > > > the > > > > resolution order is the same as in #2.) > > > > > > > > My personal preference is #1, given the unknown use case and > introduced > > > > complexity for #2 and #3. However, #2 is an acceptable alternative. > > Thus, > > > > my votes are: > > > > > > > > +1 for #1 > > > > +0 for #2 > > > > -1 for #3 > > > > > > > > Everyone, please cast your vote (in above format please!), or let me > > know > > > > if you have more questions or other candidates. > > > > > > > > Thanks, > > > > Xuefu > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 18, 2019 at 6:42 AM Aljoscha Krettek < > aljos...@apache.org> > > > > wrote: > > > > > > > > > Hi, > > > > > > > > > > I think this discussion and the one for FLIP-64 are very connected. > > To > > > > > resolve the differences, think we have to think about the basic > > > > principles > > > > > and find consensus there. The basic questions I see are: > > > > > > > > > > - Do we want to support overriding builtin functions? > > > > > - Do we want to support overriding catalog functions? > > > > > - And then later: should temporary functions be tied to a > > > > > catalog/database? > > > > > > > > > > I don’t have much to say about these, except that we should > somewhat > > > > stick > > > > > to what the industry does. But I also understand that the industry > is > > > > > already very divided on this. > > > > > > > > > > Best, > > > > > Aljoscha > > > > > > > > > > > On 18. Sep 2019, at 11:41, Jark Wu <imj...@gmail.com> wrote: > > > > > > > > > > > > Hi, > > > > > > > > > > > > +1 to strive for reaching consensus on the remaining topics. We > are > > > > > close to the truth. It will waste a lot of time if we resume the > > topic > > > > some > > > > > time later. > > > > > > > > > > > > +1 to “1-part/override” and I’m also fine with Timo’s > “cat.db.fun” > > > way > > > > > to override a catalog function. > > > > > > > > > > > > I’m not sure about “system.system.fun”, it introduces a > nonexistent > > > cat > > > > > & db? And we still need to do special treatment for the dedicated > > > > > system.system cat & db? > > > > > > > > > > > > Best, > > > > > > Jark > > > > > > > > > > > > > > > > > >> 在 2019年9月18日,06:54,Timo Walther <twal...@apache.org> 写道: > > > > > >> > > > > > >> Hi everyone, > > > > > >> > > > > > >> @Xuefu: I would like to avoid adding too many things > > incrementally. > > > > > Users should be able to override all catalog objects consistently > > > > according > > > > > to FLIP-64 (Support for Temporary Objects in Table module). If > > > functions > > > > > are treated completely different, we need more code and special > > cases. > > > > From > > > > > an implementation perspective, this topic only affects the lookup > > logic > > > > > which is rather low implementation effort which is why I would like > > to > > > > > clarify the remaining items. As you said, we have a slight consenus > > on > > > > > overriding built-in functions; we should also strive for reaching > > > > consensus > > > > > on the remaining topics. > > > > > >> > > > > > >> @Dawid: I like your idea as it ensures registering catalog > objects > > > > > consistent and the overriding of built-in functions more explicit. > > > > > >> > > > > > >> Thanks, > > > > > >> Timo > > > > > >> > > > > > >> > > > > > >> On 17.09.19 11:59, kai wang wrote: > > > > > >>> hi, everyone > > > > > >>> I think this flip is very meaningful. it supports functions > that > > > can > > > > be > > > > > >>> shared by different catalogs and dbs, reducing the duplication > of > > > > > functions. > > > > > >>> > > > > > >>> Our group based on flink's sql parser module implements create > > > > function > > > > > >>> feature, stores the parsed function metadata and schema into > > mysql, > > > > and > > > > > >>> also customizes the catalog, customizes sql-client to support > > > custom > > > > > >>> schemas and functions. Loaded, but the function is currently > > > global, > > > > > and is > > > > > >>> not subdivided according to catalog and db. > > > > > >>> > > > > > >>> In addition, I very much hope to participate in the development > > of > > > > this > > > > > >>> flip, I have been paying attention to the community, but found > it > > > is > > > > > more > > > > > >>> difficult to join. > > > > > >>> thank you. > > > > > >>> > > > > > >>> Xuefu Z <usxu...@gmail.com> 于2019年9月17日周二 上午11:19写道: > > > > > >>> > > > > > >>>> Thanks to Tmo and Dawid for sharing thoughts. > > > > > >>>> > > > > > >>>> It seems to me that there is a general consensus on having > temp > > > > > functions > > > > > >>>> that have no namespaces and overwrite built-in functions. (As > a > > > side > > > > > note > > > > > >>>> for comparability, the current user defined functions are all > > > > > temporary and > > > > > >>>> having no namespaces.) > > > > > >>>> > > > > > >>>> Nevertheless, I can also see the merit of having namespaced > temp > > > > > functions > > > > > >>>> that can overwrite functions defined in a specific cat/db. > > > However, > > > > > this > > > > > >>>> idea appears orthogonal to the former and can be added > > > > incrementally. > > > > > >>>> > > > > > >>>> How about we first implement non-namespaced temp functions now > > and > > > > > leave > > > > > >>>> the door open for namespaced ones for later releases as the > > > > > requirement > > > > > >>>> might become more crystal? This also helps shorten the debate > > and > > > > > allow us > > > > > >>>> to make some progress along this direction. > > > > > >>>> > > > > > >>>> As to Dawid's idea of having a dedicated cat/db to host the > > > > temporary > > > > > temp > > > > > >>>> functions that don't have namespaces, my only concern is the > > > special > > > > > >>>> treatment for a cat/db, which makes code less clean, as > evident > > in > > > > > treating > > > > > >>>> the built-in catalog currently. > > > > > >>>> > > > > > >>>> Thanks, > > > > > >>>> Xuefiu > > > > > >>>> > > > > > >>>> On Mon, Sep 16, 2019 at 5:07 PM Dawid Wysakowicz < > > > > > >>>> wysakowicz.da...@gmail.com> > > > > > >>>> wrote: > > > > > >>>> > > > > > >>>>> Hi, > > > > > >>>>> Another idea to consider on top of Timo's suggestion. How > about > > > we > > > > > have a > > > > > >>>>> special namespace (catalog + database) for built-in objects? > > This > > > > > catalog > > > > > >>>>> would be invisible for users as Xuefu was suggesting. > > > > > >>>>> > > > > > >>>>> Then users could still override built-in functions, if they > > fully > > > > > qualify > > > > > >>>>> object with the built-in namespace, but by default the common > > > logic > > > > > of > > > > > >>>>> current dB & cat would be used. > > > > > >>>>> > > > > > >>>>> CREATE TEMPORARY FUNCTION func ... > > > > > >>>>> registers temporary function in current cat & dB > > > > > >>>>> > > > > > >>>>> CREATE TEMPORARY FUNCTION cat.db.func ... > > > > > >>>>> registers temporary function in cat db > > > > > >>>>> > > > > > >>>>> CREATE TEMPORARY FUNCTION system.system.func ... > > > > > >>>>> Overrides built-in function with temporary function > > > > > >>>>> > > > > > >>>>> The built-in/system namespace would not be writable for > > permanent > > > > > >>>> objects. > > > > > >>>>> WDYT? > > > > > >>>>> > > > > > >>>>> This way I think we can have benefits of both solutions. > > > > > >>>>> > > > > > >>>>> Best, > > > > > >>>>> Dawid > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> On Tue, 17 Sep 2019, 07:24 Timo Walther, <twal...@apache.org > > > > > > wrote: > > > > > >>>>> > > > > > >>>>>> Hi Bowen, > > > > > >>>>>> > > > > > >>>>>> I understand the potential benefit of overriding certain > > > built-in > > > > > >>>>>> functions. I'm open to such a feature if many people agree. > > > > > However, it > > > > > >>>>>> would be great to still support overriding catalog functions > > > with > > > > > >>>>>> temporary functions in order to prototype a query even > though > > a > > > > > >>>>>> catalog/database might not be available currently or should > > not > > > be > > > > > >>>>>> modified yet. How about we support both cases? > > > > > >>>>>> > > > > > >>>>>> CREATE TEMPORARY FUNCTION abs > > > > > >>>>>> -> creates/overrides a built-in function and never > consideres > > > > > current > > > > > >>>>>> catalog and database; inconsistent with other DDL but > > acceptable > > > > for > > > > > >>>>>> functions I guess. > > > > > >>>>>> CREATE TEMPORARY FUNCTION cat.db.fun > > > > > >>>>>> -> creates/overrides a catalog function > > > > > >>>>>> > > > > > >>>>>> Regarding "Flink don't have any other built-in objects > > (tables, > > > > > views) > > > > > >>>>>> except functions", this might change in the near future. > Take > > > > > >>>>>> https://issues.apache.org/jira/browse/FLINK-13900 as an > > > example. > > > > > >>>>>> > > > > > >>>>>> Thanks, > > > > > >>>>>> Timo > > > > > >>>>>> > > > > > >>>>>> On 14.09.19 01:40, Bowen Li wrote: > > > > > >>>>>>> Hi Fabian, > > > > > >>>>>>> > > > > > >>>>>>> Yes, I agree 1-part/no-override is the least favorable > thus I > > > > > didn't > > > > > >>>>>>> include that as a voting option, and the discussion is > mainly > > > > > between > > > > > >>>>>>> 1-part/override builtin and 3-part/not override builtin. > > > > > >>>>>>> > > > > > >>>>>>> Re > However, it means that temp functions are differently > > > > treated > > > > > >>>> than > > > > > >>>>>>> other db objects. > > > > > >>>>>>> IMO, the treatment difference results from the fact that > > > > functions > > > > > >>>> are > > > > > >>>>> a > > > > > >>>>>>> bit different from other objects - Flink don't have any > other > > > > > >>>> built-in > > > > > >>>>>>> objects (tables, views) except functions. > > > > > >>>>>>> > > > > > >>>>>>> Cheers, > > > > > >>>>>>> Bowen > > > > > >>>>>>> > > > > > >>>>>> > > > > > >>>> > > > > > >>>> -- > > > > > >>>> Xuefu Zhang > > > > > >>>> > > > > > >>>> "In Honey We Trust!" > > > > > >>>> > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Xuefu Zhang > > > > > > > > "In Honey We Trust!" > > > > > > > > > > > > > -- > > Xuefu Zhang > > > > "In Honey We Trust!" > > > -- Xuefu Zhang "In Honey We Trust!"