Thanks Bowen for driving this. +1 for the general idea. It makes the function resolved behavior more clear and deterministic. Besides, the user can use all hive built-in functions, which is a great feature.
I only have one comment, but maybe it may touch your design so I think it would make sense to reply this mail instead of comment on google doc. Regarding to the classfication of functions, you currently have 4 types of functions, which are: 1. temporary functions 2. Flink built-in functions 3. Hive built-in functions (or generalized as external built-in functions) 4. catalog functions What I want to propose is we can merge #3 and #4, make them both under "catalog" concept, by extending catalog function to make it have ability to have built-in catalog functions. Some benefits I can see from this approach: 1. We don't have to introduce new concept like external built-in functions. Actually I don't see a full story about how to treat a built-in functions, and it seems a little bit disrupt with catalog. As a result, you have to make some restriction like "hive built-in functions can only be used when current catalog is hive catalog". 2. It makes us easier to adopt another system's built-in functions to Flink, such as MySQL. If we don't treat uniformly with "external built-in functions" and "external catalog function", things like user set current catalog to hive but want to use MySQL's built-in function will happen. One more thing, follow this approach, it's clear for your question about how to support external built-in functions, which is "add a getBuiltInFunction to current Catalog API". What do you think? Best, Kurt On Fri, Aug 30, 2019 at 7:14 AM Bowen Li <bowenl...@gmail.com> wrote: > Thanks everyone for the feedback. > > I have updated the document accordingly. Here're the summary of changes: > > - clarify the concept of temporary functions, to facilitate deciding > function resolution order > - provide two options to support Hive built-in functions, with the 2nd one > being preferred > - add detailed prototype code for FunctionCatalog#lookupFunction(name) > - move the section of ”rename existing FunctionCatalog APIs in favor of > temporary functions“ out of the scope of the FLIP > - add another reasonable limitation for function resolution, to not > consider resolving overloaded functions - those with the same name but > different params. (It's still valid to have a single function with > overloaded eval() methods) > > Please take another look. > > Thanks, > Bowen > > On Tue, Aug 27, 2019 at 11:49 AM Bowen Li <bowenl...@gmail.com> wrote: > > > Hi folks, > > > > I'd like to kick off a discussion on reworking Flink's FunctionCatalog. > > It's critically helpful to improve function usability in SQL. > > > > > > > https://docs.google.com/document/d/1w3HZGj9kry4RsKVCduWp82HkW6hhgi2unnvOAUS72t8/edit?usp=sharing > > > > In short, it: > > - adds support for precise function reference with fully/partially > > qualified name > > - redefines function resolution order for ambiguous function reference > > - adds support for Hive's rich built-in functions (support for Hive user > > defined functions was already added in 1.9.0) > > - clarifies the concept of temporary functions > > > > Would love to hear your thoughts. > > > > Bowen > > >