Hi Jingsong,

Re> 1.Hive built-in functions is an intermediate solution. So we should
> not introduce interfaces to influence the framework. To make
> Flink itself more powerful, we should implement the functions
> we need to add.

Yes, please see the doc.

Re> 2.Non-flink built-in functions are easy for users to change their
> behavior. If we support some flink built-in functions in the
> future but act differently from non-flink built-in, this will lead to
> changes in user behavior.

There's no such concept as "external built-in functions" any more. Built-in
functions of external systems will be treated as special catalog functions.

Re> Another question is, does this fallback include all
> hive built-in functions? As far as I know, some hive functions
> have some hacky. If possible, can we start with a white list?
> Once we implement some functions to flink built-in, we can
> also update the whitelist.

Yes, that's something we thought of too. I don't think it's super critical
to the scope of this FLIP, thus I'd like to leave it to future efforts as a
nice-to-have feature.


On Tue, Sep 3, 2019 at 1:37 PM Bowen Li <bowenl...@gmail.com> wrote:

> Hi Kurt,
>
> Re: > What I want to propose is we can merge #3 and #4, make them both
> under
> >"catalog" concept, by extending catalog function to make it have ability
> to
> >have built-in catalog functions. Some benefits I can see from this
> approach:
> >1. We don't have to introduce new concept like external built-in
> functions.
> >Actually I don't see a full story about how to treat a built-in
> functions, and it
> >seems a little bit disrupt with catalog. As a result, you have to make
> some restriction
> >like "hive built-in functions can only be used when current catalog is
> hive catalog".
>
> Yes, I've unified #3 and #4 but it seems I didn't update some part of the
> doc. I've modified those sections, and they are up to date now.
>
> In short, now built-in function of external systems are defined as a
> special kind of catalog function in Flink, and handled by Flink as
> following:
> - An external built-in function must be associated with a catalog for the
> purpose of decoupling flink-table and external systems.
> - It always resides in front of catalog functions in ambiguous function
> reference order, just like in its own external system
> - It is a special catalog function that doesn’t have a schema/database
> namespace
> - It goes thru the same instantiation logic as other user defined catalog
> functions in the external system
>
> Please take another look at the doc, and let me know if you have more
> questions.
>
>
> On Tue, Sep 3, 2019 at 7:28 AM Timo Walther <twal...@apache.org> wrote:
>
>> Hi Kurt,
>>
>> it should not affect the functions and operations we currently have in
>> SQL. It just categorizes the available built-in functions. It is kind of
>> an orthogonal concept to the catalog API but built-in functions deserve
>> this special kind of treatment. CatalogFunction still fits perfectly in
>> there because the regular catalog object resolution logic is not
>> affected. So tables and functions are resolved in the same way but with
>> built-in functions that have priority as in the original design.
>>
>> Regards,
>> Timo
>>
>>
>> On 03.09.19 15:26, Kurt Young wrote:
>> > Does this only affect the functions and operations we currently have in
>> SQL
>> > and
>> > have no effect on tables, right? Looks like this is an orthogonal
>> concept
>> > with Catalog?
>> > If the answer are both yes, then the catalog function will be a weird
>> > concept?
>> >
>> > Best,
>> > Kurt
>> >
>> >
>> > On Tue, Sep 3, 2019 at 8:10 PM Danny Chan <yuzhao....@gmail.com> wrote:
>> >
>> >> The way you proposed are basically the same as what Calcite does, I
>> think
>> >> we are in the same line.
>> >>
>> >> Best,
>> >> Danny Chan
>> >> 在 2019年9月3日 +0800 PM7:57,Timo Walther <twal...@apache.org>,写道:
>> >>> This sounds exactly as the module approach I mentioned, no?
>> >>>
>> >>> Regards,
>> >>> Timo
>> >>>
>> >>> On 03.09.19 13:42, Danny Chan wrote:
>> >>>> Thanks Bowen for bring up this topic, I think it’s a useful
>> >> refactoring to make our function usage more user friendly.
>> >>>> For the topic of how to organize the builtin operators and operators
>> >> of Hive, here is a solution from Apache Calcite, the Calcite way is to
>> make
>> >> every dialect operators a “Library”, user can specify which libraries
>> they
>> >> want to use for a sql query. The builtin operators always comes as the
>> >> first class objects and the others are used from the order they
>> appears.
>> >> Maybe you can take a reference.
>> >>>> [1]
>> >>
>> https://github.com/apache/calcite/commit/9a4eab5240d96379431d14a1ac33bfebaf6fbb28
>> >>>> Best,
>> >>>> Danny Chan
>> >>>> 在 2019年8月28日 +0800 AM2:50,Bowen Li <bowenl...@gmail.com>,写道:
>> >>>>> Hi folks,
>> >>>>>
>> >>>>> I'd like to kick off a discussion on reworking Flink's
>> >> FunctionCatalog.
>> >>>>> It's critically helpful to improve function usability in SQL.
>> >>>>>
>> >>>>>
>> >>
>> https://docs.google.com/document/d/1w3HZGj9kry4RsKVCduWp82HkW6hhgi2unnvOAUS72t8/edit?usp=sharing
>> >>>>> In short, it:
>> >>>>> - adds support for precise function reference with fully/partially
>> >>>>> qualified name
>> >>>>> - redefines function resolution order for ambiguous function
>> >> reference
>> >>>>> - adds support for Hive's rich built-in functions (support for Hive
>> >> user
>> >>>>> defined functions was already added in 1.9.0)
>> >>>>> - clarifies the concept of temporary functions
>> >>>>>
>> >>>>> Would love to hear your thoughts.
>> >>>>>
>> >>>>> Bowen
>> >>>
>>
>>

Reply via email to