Big +1 on this feature! It would be great to extend the usage of Python UDFs
in SQL scenarios.
The design doc looks good from my side now. Thank you for the update.

Best,
Hequn

On Tue, Mar 10, 2020 at 3:50 PM Wei Zhong <weizhong0...@gmail.com> wrote:

> Hi Timo,
>
> Thanks for your reply.
>
> If we aim for option 1, it makes sense to me to include the change in
> this FLIP, as option 1 does not change any public API. I'll update the
> FLIP page to reflect this.
>
> Best,
> Wei
>
> > On Mar 9, 2020, at 17:58, Timo Walther <twal...@apache.org> wrote:
> >
> > Hi Wei,
> >
> > I agree with Dawid that we should defer the instantiation of temporary
> functions to compile time. In the long-term, we would like to integrate
> FunctionCatalog as a component of CatalogManager and unify the handling of
> catalog objects as much as possible.
> >
> > We should aim for your proposed option 1. For fluent definition of
> functions in Table API, we would still like to offer passing instances like
> `t.select(call(new ScalarFunction() { ... }))` that would be registered as
> temporary system functions.
> >
> > Regards,
> > Timo
> >
> >
> > On 09.03.20 09:24, Wei Zhong wrote:
> >> Hi Dawid,
> >> I think deferring the instantiation of temporary functions to compile
> time is quite a good idea but needs further discussion. As it is orthogonal
> to this FLIP, we could continue the discussion in a new thread later.
> What do you think?
> >> Best,
> >> Wei
> >>> On Mar 5, 2020, at 21:11, Wei Zhong <weizhong0...@gmail.com> wrote:
> >>>
> >>> Hi Dawid,
> >>>
> >>> Thanks for your suggestion.
> >>>
> >>> After some investigation, there are two designs in my mind about how
> to defer the instantiation of temporary system function and temporary
> catalog function to compile time.
> >>>
> >>> 1. FunctionCatalog accepts both FunctionDefinitions and uninstantiated
> temporary functions. The uninstantiated temporary functions will be
> instantiated when compiling. There is no public API change in this design,
> but the FunctionCatalog needs to store and process both FunctionDefinitions
> and uninstantiated temporary functions.
> >>>
> >>> 2. FunctionCatalog accepts only uninstantiated temporary functions. In
> this design we need to remove those APIs that accept FunctionDefinitions
> from TableEnvironment, i.e. `void createTemporaryFunction(String path,
> UserDefinedFunction functionInstance)` and `void
> createTemporarySystemFunction(String name, UserDefinedFunction
> functionInstance)`. But the FunctionCatalog only needs to store and process
> uninstantiated temporary functions.
> >>>
> >>> As I don't know the details of the plan to store temporary
> functions as catalog functions instead of FunctionDefinitions, I'm not sure
> which solution fits better. It would be great if you could share more details
> or some thoughts on these two solutions.
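
To make option 1 above more concrete, here is a minimal, self-contained sketch. It does not use Flink's real classes: `LazyCatalogSketch`, its nested `Definition` interface and `Catalog` class, and the methods `registerInstance`, `registerDeferred` and `lookup` are all hypothetical names chosen for illustration. It only shows the idea of storing instantiated definitions and uninstantiated temporary functions side by side, and instantiating the latter when a query is compiled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class LazyCatalogSketch {

    /** Hypothetical stand-in for a function definition; not Flink's interface. */
    interface Definition {
        String name();
    }

    /** Hypothetical catalog that holds eager instances and deferred factories. */
    static class Catalog {
        private final Map<String, Definition> instantiated = new HashMap<>();
        private final Map<String, Supplier<Definition>> deferred = new HashMap<>();

        /** Registers an existing instance, as the instance-based APIs do. */
        void registerInstance(String path, Definition definition) {
            deferred.remove(path);
            instantiated.put(path, definition);
        }

        /** Registers a factory only; nothing is instantiated at this point. */
        void registerDeferred(String path, Supplier<Definition> factory) {
            instantiated.remove(path);
            deferred.put(path, factory);
        }

        /** Called during compilation; deferred entries are instantiated here. */
        Definition lookup(String path) {
            Definition definition = instantiated.get(path);
            if (definition != null) {
                return definition;
            }
            Supplier<Definition> factory = deferred.get(path);
            if (factory == null) {
                throw new IllegalArgumentException("Unknown function: " + path);
            }
            return factory.get();
        }
    }

    public static void main(String[] args) {
        Catalog catalog = new Catalog();
        int[] instantiations = {0};

        // An instance passed directly is stored as-is.
        catalog.registerInstance("upper", () -> "upper");

        // A deferred function is only described by its factory.
        catalog.registerDeferred("py_add", () -> {
            instantiations[0]++;
            return () -> "py_add";
        });

        System.out.println("instantiations after registration: " + instantiations[0]);
        System.out.println("resolved: " + catalog.lookup("py_add").name());
        System.out.println("instantiations after lookup: " + instantiations[0]);
    }
}
```

In this sketch the catalog needs two maps and a lookup that handles both cases, which matches the extra bookkeeping option 1 mentions. Option 2 would correspond to dropping `registerInstance` and the first map.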
> >>>
> >>> Best,
> >>> Wei
> >>>
> >>>> On Mar 4, 2020, at 16:17, Dawid Wysakowicz <dwysakow...@apache.org> wrote:
> >>>>
> >>>> Hi all,
> >>>> I had a really quick look and from my perspective the proposal looks
> fine.
> >>>> I share Jark's opinion that the instantiation could be done at a later
> >>>> stage. I agree with Wei it requires some changes in the internal
> >>>> implementation of the FunctionCatalog, to store temporary functions as
> >>>> catalog functions instead of FunctionDefinitions, but we have that on
> our
> >>>> agenda anyway. I would suggest investigating if we could do that as
> part of
> >>>> this FLIP already. Nevertheless, in theory this can also be done later.
> >>>>
> >>>> Best,
> >>>> Dawid
> >>>>
> >>>> On Mon, 2 Mar 2020, 14:58 Jark Wu, <imj...@gmail.com> wrote:
> >>>>
> >>>>> Thanks for the explanation, Wei!
> >>>>>
> >>>>> On Mon, 2 Mar 2020 at 20:59, Wei Zhong <weizhong0...@gmail.com>
> wrote:
> >>>>>
> >>>>>> Hi Jark,
> >>>>>>
> >>>>>> Thanks for your suggestion.
> >>>>>>
> >>>>>> Actually, the timing of starting a Python process depends on the UDF
> >>>>> type,
> >>>>>> because the Python process is used to provide the necessary
> information
> >>>>> to
> >>>>>> instantiate the FunctionDefinition object of the Python UDF. For
> catalog
> >>>>>> function, the FunctionDefinition will be instantiated when
> compiling the
> >>>>>> job, which means the Python process is required during the
> compilation
> >>>>>> instead of the registration. For temporary system function and
> temporary
> >>>>>> catalog function, the FunctionDefinition will be instantiated
> during the
> >>>>>> UDF registration, so the Python process needs to be started at that
> time.
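
The timing described in the paragraph above can be summarised as a small lookup. This is only an illustrative sketch of the behaviour as stated in this message (before any deferral of temporary functions). The class, enum and method names are hypothetical and do not exist in Flink.

```java
class PythonProcessTimingSketch {

    /** The three kinds of function registration discussed in this thread. */
    enum FunctionKind { CATALOG, TEMPORARY_SYSTEM, TEMPORARY_CATALOG }

    /** The two points in time at which a definition could be instantiated. */
    enum Phase { REGISTRATION, COMPILATION }

    /**
     * Returns the phase in which the FunctionDefinition is instantiated,
     * which is also when the Python process has to be available.
     */
    static Phase pythonProcessNeededAt(FunctionKind kind) {
        switch (kind) {
            case CATALOG:
                // Only meta-information is stored; instantiation happens when compiling the job.
                return Phase.COMPILATION;
            case TEMPORARY_SYSTEM:
            case TEMPORARY_CATALOG:
                // The definition is instantiated as part of registering the UDF.
                return Phase.REGISTRATION;
            default:
                throw new IllegalArgumentException("Unknown kind: " + kind);
        }
    }

    public static void main(String[] args) {
        for (FunctionKind kind : FunctionKind.values()) {
            System.out.println(kind + " -> " + pythonProcessNeededAt(kind));
        }
    }
}
```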
> >>>>>>
> >>>>>> But this FLIP will only support registering the temporary system
> function
> >>>>>> and temporary catalog function in SQL DDL because registering
> Python UDF
> >>>>> to
> >>>>>> catalog is not supported yet. We plan to support the registration
> of
> >>>>>> Python catalog function (via Table API and SQL DDL) in a separate
> FLIP.
> >>>>>> I'll add a non-goal section to the FLIP page to illustrate this.
> >>>>>>
> >>>>>> Best,
> >>>>>> Wei
> >>>>>>
> >>>>>>
> >>>>>>> On Mar 2, 2020, at 15:11, Jark Wu <imj...@gmail.com> wrote:
> >>>>>>>
> >>>>>>> Hi Weizhong,
> >>>>>>>
> >>>>>>> Thanks for proposing this feature. In general, I'm +1 from the
> table's
> >>>>>> view.
> >>>>>>>
> >>>>>>> I have one suggestion: I think registering a Python function into the
> >>>>> catalog
> >>>>>>> doesn't need to start up a Python process (the "High Level Sequence
> >>>>> Diagram"
> >>>>>>> in your FLIP).
> >>>>>>> Because only meta-information is persisted into catalog, we don't
> need
> >>>>> to
> >>>>>>> store "return type", "input types" into catalog.
> >>>>>>> I guess the python process is required when compiling a SQL job.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Jark
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Fri, 28 Feb 2020 at 19:04, Benchao Li <libenc...@gmail.com>
> wrote:
> >>>>>>>
> >>>>>>>> Big +1 for this feature.
> >>>>>>>>
> >>>>>>>> We built our SQL platform on the Java Table API, and most common UDFs
> are
> >>>>>>>> implemented in Java. However, some Python developers are not
> familiar
> >>>>>> with
> >>>>>>>> Java/Scala, and it's very inconvenient for these users to use UDFs
> in
> >>>>>> SQL.
> >>>>>>>>
> >>>>>>>> On Fri, Feb 28, 2020 at 6:58 PM Wei Zhong <weizhong0...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks for your reply, Dan!
> >>>>>>>>>
> >>>>>>>>> By the way, this FLIP is closely related to the SQL API.  @Jark
> Wu <
> >>>>>>>>> imj...@gmail.com> @Timo <twal...@apache.org> could you please
> take a
> >>>>>>>>> look?
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>> Wei
> >>>>>>>>>
> >>>>>>>>>> On Feb 25, 2020, at 16:25, zoudan <zoud...@163.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> +1 for supporting Python UDF in Java/Scala Table API.
> >>>>>>>>>> This is a great feature and would be helpful for Python users!
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Dan Zou
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> --
> >>>>>>>>
> >>>>>>>> Benchao Li
> >>>>>>>> School of Electronics Engineering and Computer Science, Peking
> >>>>>> University
> >>>>>>>> Tel:+86-15650713730
> >>>>>>>> Email: libenc...@gmail.com; libenc...@pku.edu.cn
> >>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>
> >
>
>
