Hi Wei,
Thanks a lot for drafting the FLIP and kicking off the discussion.
Big +1 for this feature.
This feature will make it much easier for PyFlink users to use Python UDFs in
SQL scenarios.

Best,
Xingbo

On Fri, Mar 13, 2020 at 5:10 PM Hequn Cheng <he...@apache.org> wrote:

> Big +1 on this feature! It would be great to extend the usage of Python UDF
> in SQL scenarios.
> The design doc looks good from my side now. Thank you for the update.
>
> Best,
> Hequn
>
> On Tue, Mar 10, 2020 at 3:50 PM Wei Zhong <weizhong0...@gmail.com> wrote:
>
> > Hi Timo,
> >
> > Thanks for your reply.
> >
> > If we aim for option 1, it makes sense to me to include the change in
> > this FLIP, as option 1 does not change any public API. I'll update the
> > FLIP page to illustrate this.
> >
> > Best,
> > Wei
> >
> > > On Mar 9, 2020, at 17:58, Timo Walther <twal...@apache.org> wrote:
> > >
> > > Hi Wei,
> > >
> > > I agree with Dawid that we should defer the instantiation of temporary
> > > functions to compile time. In the long term, we would like to integrate
> > > FunctionCatalog as a component of CatalogManager and unify the handling
> > > of catalog objects as much as possible.
> > >
> > > We should aim for your proposed option 1. For fluent definition of
> > > functions in Table API, we would still like to offer passing instances
> > > like `t.select(call(new ScalarFunction() { ... }))` that would be
> > > registered as temporary system functions.
> > >
> > > Regards,
> > > Timo
> > >
> > >
> > > On 09.03.20 09:24, Wei Zhong wrote:
> > >> Hi Dawid,
> > >> I think deferring the instantiation of temporary functions to compile
> > >> time is quite a good idea, but it needs further discussion. As it is
> > >> orthogonal to this FLIP, we could continue the discussion in a new
> > >> thread later. What do you think?
> > >> Best,
> > >> Wei
> > >>> On Mar 5, 2020, at 21:11, Wei Zhong <weizhong0...@gmail.com> wrote:
> > >>>
> > >>> Hi Dawid,
> > >>>
> > >>> Thanks for your suggestion.
> > >>>
> > >>> After some investigation, I have two designs in mind for deferring the
> > >>> instantiation of temporary system functions and temporary catalog
> > >>> functions to compile time.
> > >>>
> > >>> 1. FunctionCatalog accepts both FunctionDefinitions and uninstantiated
> > >>> temporary functions. The uninstantiated temporary functions will be
> > >>> instantiated when compiling. There is no public API change in this
> > >>> design, but the FunctionCatalog needs to store and process both
> > >>> FunctionDefinitions and uninstantiated temporary functions.
> > >>>
> > >>> 2. FunctionCatalog accepts only uninstantiated temporary functions. In
> > >>> this design we need to remove the APIs that accept FunctionDefinitions
> > >>> from TableEnvironment, i.e. `void createTemporaryFunction(String path,
> > >>> UserDefinedFunction functionInstance)` and `void
> > >>> createTemporarySystemFunction(String name, UserDefinedFunction
> > >>> functionInstance)`. But the FunctionCatalog only needs to store and
> > >>> process uninstantiated temporary functions.
> > >>>
> > >>> As I don't know the details of the plan to store temporary functions
> > >>> as catalog functions instead of FunctionDefinitions, I'm not sure
> > >>> which solution fits better. It would be great if you could share more
> > >>> details or some thoughts on these two solutions.
> > >>>
> > >>> Best,
> > >>> Wei
> > >>>
> > >>>> On Mar 4, 2020, at 16:17, Dawid Wysakowicz <dwysakow...@apache.org> wrote:
> > >>>>
> > >>>> Hi all,
> > >>>> I had a really quick look and from my perspective the proposal looks
> > >>>> fine.
> > >>>> I share Jark's opinion that the instantiation could be done at a later
> > >>>> stage. I agree with Wei that it requires some changes in the internal
> > >>>> implementation of the FunctionCatalog, to store temporary functions as
> > >>>> catalog functions instead of FunctionDefinitions, but we have that on
> > >>>> our agenda anyway. I would suggest investigating whether we could do
> > >>>> that as part of this FLIP already. Nevertheless, in theory this can
> > >>>> also be done later.
> > >>>>
> > >>>> Best,
> > >>>> Dawid
> > >>>>
> > >>>> On Mon, 2 Mar 2020, 14:58 Jark Wu, <imj...@gmail.com> wrote:
> > >>>>
> > >>>>> Thanks for the explanation, Wei!
> > >>>>>
> > >>>>> On Mon, 2 Mar 2020 at 20:59, Wei Zhong <weizhong0...@gmail.com> wrote:
> > >>>>>
> > >>>>>> Hi Jark,
> > >>>>>>
> > >>>>>> Thanks for your suggestion.
> > >>>>>>
> > >>>>>> Actually, the timing of starting a Python process depends on the UDF
> > >>>>>> type, because the Python process is used to provide the necessary
> > >>>>>> information to instantiate the FunctionDefinition object of the
> > >>>>>> Python UDF. For a catalog function, the FunctionDefinition will be
> > >>>>>> instantiated when compiling the job, which means the Python process
> > >>>>>> is required during compilation instead of registration. For
> > >>>>>> temporary system functions and temporary catalog functions, the
> > >>>>>> FunctionDefinition will be instantiated during UDF registration, so
> > >>>>>> the Python process needs to be started at that time.
> > >>>>>>
> > >>>>>> But this FLIP will only support registering temporary system
> > >>>>>> functions and temporary catalog functions in SQL DDL, because
> > >>>>>> registering Python UDFs to a catalog is not supported yet. We plan to
> > >>>>>> support the registration of Python catalog functions (via Table API
> > >>>>>> and SQL DDL) in a separate FLIP.
> > >>>>>> I'll add a non-goal section to the FLIP page to illustrate this.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Wei
> > >>>>>>
> > >>>>>>
> > >>>>>>> On Mar 2, 2020, at 15:11, Jark Wu <imj...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>> Hi Weizhong,
> > >>>>>>>
> > >>>>>>> Thanks for proposing this feature. In general, I'm +1 from the
> > >>>>>>> table's view.
> > >>>>>>>
> > >>>>>>> I have one suggestion: I think registering a Python function into
> > >>>>>>> the catalog doesn't need to start up a Python process (the "High
> > >>>>>>> Level Sequence Diagram" in your FLIP).
> > >>>>>>> Because only meta-information is persisted into the catalog, we
> > >>>>>>> don't need to store "return type" or "input types" in the catalog.
> > >>>>>>> I guess the Python process is required when compiling a SQL job.
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Jark
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On Fri, 28 Feb 2020 at 19:04, Benchao Li <libenc...@gmail.com> wrote:
> > >>>>>>>
> > >>>>>>>> Big +1 for this feature.
> > >>>>>>>>
> > >>>>>>>> We built our SQL platform on the Java Table API, and most common
> > >>>>>>>> UDFs are implemented in Java. However, some Python developers are
> > >>>>>>>> not familiar with Java/Scala, and it's very inconvenient for these
> > >>>>>>>> users to use UDFs in SQL.
> > >>>>>>>>
> > >>>>>>>> On Fri, Feb 28, 2020 at 6:58 PM Wei Zhong <weizhong0...@gmail.com> wrote:
> > >>>>>>>>
> > >>>>>>>>> Thanks for your reply, Dan!
> > >>>>>>>>>
> > >>>>>>>>> By the way, this FLIP is closely related to the SQL API. @Jark Wu
> > >>>>>>>>> <imj...@gmail.com> @Timo <twal...@apache.org> could you please
> > >>>>>>>>> take a look?
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>> Wei
> > >>>>>>>>>
> > >>>>>>>>>> On Feb 25, 2020, at 16:25, zoudan <zoud...@163.com> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>> +1 for supporting Python UDF in Java/Scala Table API.
> > >>>>>>>>>> This is a great feature and would be helpful for Python users!
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Dan Zou
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>>> --
> > >>>>>>>>
> > >>>>>>>> Benchao Li
> > >>>>>>>> School of Electronics Engineering and Computer Science, Peking University
> > >>>>>>>> Tel:+86-15650713730
> > >>>>>>>> Email: libenc...@gmail.com; libenc...@pku.edu.cn
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >
> >
> >
>
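
For readers following the thread, "option 1" from Wei's message (a catalog
that stores both ready-made function definitions and uninstantiated temporary
functions, and instantiates the latter only at compile time) can be pictured
with the rough sketch below. It is only an illustration under stated
assumptions: `FunctionDefinitionSketch`, `FunctionCatalogSketch`,
`registerInstance`, `registerDeferred` and `lookup` are hypothetical names
made up for this example and are not Flink's actual FunctionCatalog API. The
caching of the instantiated function is also a choice of this sketch, not
something the thread specifies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for a function definition; NOT Flink's real interface.
interface FunctionDefinitionSketch {
    String describe();
}

// Hypothetical catalog illustrating "option 1": it keeps both already
// instantiated functions and deferred (uninstantiated) ones.
final class FunctionCatalogSketch {
    private final Map<String, FunctionDefinitionSketch> instantiated = new HashMap<>();
    private final Map<String, Supplier<FunctionDefinitionSketch>> deferred = new HashMap<>();

    // Registration with a ready instance (the existing public API style).
    void registerInstance(String name, FunctionDefinitionSketch definition) {
        deferred.remove(name);
        instantiated.put(name, definition);
    }

    // Registration with only a description of how to build the function,
    // e.g. what a SQL DDL statement for a Python UDF might provide.
    void registerDeferred(String name, Supplier<FunctionDefinitionSketch> factory) {
        instantiated.remove(name);
        deferred.put(name, factory);
    }

    // Stands in for "compile time": a deferred function is instantiated on
    // first lookup and then cached (an assumption of this sketch).
    FunctionDefinitionSketch lookup(String name) {
        FunctionDefinitionSketch existing = instantiated.get(name);
        if (existing != null) {
            return existing;
        }
        Supplier<FunctionDefinitionSketch> factory = deferred.remove(name);
        if (factory == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        FunctionDefinitionSketch created = factory.get();
        instantiated.put(name, created);
        return created;
    }
}

final class LazyCatalogDemo {
    public static void main(String[] args) {
        FunctionCatalogSketch catalog = new FunctionCatalogSketch();
        // Nothing expensive (such as starting a Python process) happens here.
        catalog.registerDeferred("py_upper", () -> () -> "python udf py_upper");
        // The factory runs only now, when the function is actually needed.
        System.out.println(catalog.lookup("py_upper").describe());
    }
}
```

Under "option 2" the `registerInstance` path would disappear and only the
deferred path would remain, which is why that variant implies removing public
API methods.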
