Hi Jingsong,

Thanks for the input. The Flink function DDL definitely needs to align with HQL, so I updated the doc accordingly:

CREATE FUNCTION [db_name.]function_name AS class_name
  [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
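For illustration only, here is a rough Java sketch of how the resources in the USING clause above could be modeled so that only JAR entries reach a class loader, with a separate loader created per session. Every name in it (FunctionResourceSketch, ResourceType, FunctionResource, jarUrls, newSessionClassLoader) is hypothetical and is not the interface described in the doc; the file: URIs are made-up examples.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these names come from Flink or from the FLIP doc.
final class FunctionResourceSketch {

    // Resource kinds allowed by the USING clause.
    enum ResourceType { JAR, FILE, ARCHIVE }

    // One "JAR|FILE|ARCHIVE 'file_uri'" entry of a CREATE FUNCTION statement.
    static final class FunctionResource {
        final ResourceType type;
        final String uri;

        FunctionResource(ResourceType type, String uri) {
            this.type = type;
            this.uri = uri;
        }
    }

    // Only JAR resources belong on the class path; FILE and ARCHIVE are skipped
    // here because they would need some other kind of handling.
    static List<URL> jarUrls(List<FunctionResource> resources) {
        List<URL> urls = new ArrayList<>();
        for (FunctionResource r : resources) {
            if (r.type != ResourceType.JAR) {
                continue;
            }
            try {
                urls.add(URI.create(r.uri).toURL());
            } catch (MalformedURLException e) {
                throw new IllegalArgumentException("Unsupported jar URI: " + r.uri, e);
            }
        }
        return urls;
    }

    // A fresh loader per session keeps jars of different sessions apart.
    static URLClassLoader newSessionClassLoader(List<FunctionResource> resources,
                                                ClassLoader parent) {
        return new URLClassLoader(jarUrls(resources).toArray(new URL[0]), parent);
    }

    public static void main(String[] args) throws Exception {
        List<FunctionResource> resources = Arrays.asList(
                new FunctionResource(ResourceType.JAR, "file:///tmp/udf-a.jar"),
                new FunctionResource(ResourceType.FILE, "file:///tmp/lookup.txt"));
        ClassLoader parent = FunctionResourceSketch.class.getClassLoader();
        try (URLClassLoader session1 = newSessionClassLoader(resources, parent);
             URLClassLoader session2 = newSessionClassLoader(resources, parent)) {
            System.out.println("JAR URLs per session: " + session1.getURLs().length);
            System.out.println("Distinct loaders: " + (session1 != session2));
        }
    }
}
```

The sketch only handles URI schemes the JVM already knows (such as file:); a scheme like hdfs: would need the jar to be fetched first, which is left out here.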
For your other questions below:

1) How to load resources for a function (how to deal with jar/file/archive).
Let me consider jars first. For file and archive, I will do more study on the Hive side. The basic idea for loading a jar without dependency conflicts is to use separate class loaders for different sessions. I updated the doc with the interface change required to achieve this.

2) How to pass properties to a function.
It can be a setProperties function in the UDF interface or a constructor that takes a Map of parameters. As Bowen commented on the doc, I think we probably just need to let customers provide such a constructor if they want to use properties in DDL.

3) How does a Python UDF work?
It is not in the scope of this FLIP. I think FLIP-78 will provide the runtime support, so we just need to bridge the DDL with its runtime interface. This part does need to be added, but probably in the next phase after the MVP is done.

Best Regards
Peter Huang

On Thu, Oct 24, 2019 at 11:07 PM Jingsong Li <jingsongl...@gmail.com> wrote:

> Hi Peter,
>
> Thanks for your proposal. The first thing I care about most is whether it
> can cover the needs of hive.
> Hive create function:
>
> CREATE FUNCTION [db_name.]function_name AS class_name
> [USING JAR|FILE|ARCHIVE 'file_uri' [, JAR|FILE|ARCHIVE 'file_uri'] ];
>
> Hive support a list of resources, and support jar/file/archive, Maybe we
> need users to tell us exactly what kind of resources are. So we can see
> whether to add it to the ClassLoader or other processing?
>
> +1 for the internal implementation as timo said, like:
> - how to load resources for function. (How to deal with jar/file/archive)
> - how to pass properties to function.
> - How does python udf work? Hive use Transform command to run shell and
> python. It would be better if we could make clear how to do.
>
> Hope to get your reply~
>
> Best,
> Jingsong Lee
>
> On Thu, Oct 24, 2019 at 5:14 PM Timo Walther <twal...@apache.org> wrote:
>
> > Hi Peter,
> >
> > thanks for your proposal. I left some comments in the FLIP document. I
> > agree with Terry that we can have a MVP in Flink 1.10 but should already
> > discuss the bigger picture as a DDL string cannot be changed easily once
> > released.
> >
> > In particular we should discuss how resources for function are loaded.
> > If they are simply added to the JobGraph they are available to all
> > functions and could potentially interfere with each other, right?
> >
> > Thanks,
> > Timo
> >
> > On 24.10.19 05:32, Terry Wang wrote:
> > > Hi Peter,
> > >
> > > Sorry late to reply. Thanks for your efforts on this and I just looked
> > > through your design.
> > > I left some comments in the doc about alter function section and
> > > function catalog interface.
> > > IMO, the overall design is ok and we can discuss further more about some
> > > details.
> > > I also think it’s necessary to have this awesome feature limit to basic
> > > function (of course better to have all :) ) in 1.10 release.
> > >
> > > Best,
> > > Terry Wang
> > >
> > >> On Oct 16, 2019, at 14:19, Peter Huang <huangzhenqiu0...@gmail.com> wrote:
> > >>
> > >> Hi Xuefu,
> > >>
> > >> Thank you for the feedback. I think you are pointing out a similar concern
> > >> with Bowen. Let me describe
> > >> how the catalog function and function factory will be changed in the
> > >> implementation section.
> > >> Then, we can have more discussion in detail.
> > >>
> > >>
> > >> Best Regards
> > >> Peter Huang
> > >>
> > >> On Tue, Oct 15, 2019 at 4:18 PM Xuefu Z <usxu...@gmail.com> wrote:
> > >>
> > >>> Thanks to Peter for the proposal!
> > >>>
> > >>> I left some comments in the google doc. Besides what Bowen pointed out, I'm
> > >>> unclear about how things work end to end from the document.
> > >>> For instance, SQL DDL-like function definition is mentioned. I guess just
> > >>> having a DDL for it doesn't explain how it's supported functionally. I think
> > >>> it's better to have some clarification on what is expected work and what's
> > >>> for the future.
> > >>>
> > >>> Thanks,
> > >>> Xuefu
> > >>>
> > >>>
> > >>> On Tue, Oct 15, 2019 at 11:05 AM Bowen Li <bowenl...@gmail.com> wrote:
> > >>>
> > >>>> Hi Zhenqiu,
> > >>>>
> > >>>> Thanks for taking on this effort!
> > >>>>
> > >>>> A couple questions:
> > >>>> - Though this FLIP is about function DDL, can we also think about how the
> > >>>> created functions can be mapped to CatalogFunction and see if we need to
> > >>>> modify CatalogFunction interface? Syntax changes need to be backed by the
> > >>>> backend.
> > >>>> - Can we define a clearer, smaller scope targeting for Flink 1.10 among all
> > >>>> the proposed changes? The current overall scope seems to be quite wide, and
> > >>>> it may be unrealistic to get everything in a single release, or even a
> > >>>> couple. However, I believe the most common user story can be something as
> > >>>> simple as "being able to create and persist a java class-based udf and use
> > >>>> it later in queries", which will add great value for most Flink users and
> > >>>> is achievable in 1.10.
> > >>>>
> > >>>> Bowen
> > >>>>
> > >>>> On Sun, Oct 13, 2019 at 10:46 PM Peter Huang <huangzhenqiu0...@gmail.com>
> > >>>> wrote:
> > >>>>
> > >>>>> Dear Community,
> > >>>>>
> > >>>>> FLIP-79 Flink Function DDL Support
> > >>>>> <https://docs.google.com/document/d/16kkHlis80s61ifnIahCj-0IEdy5NJ1z-vGEJd_JuLog/edit#>
> > >>>>>
> > >>>>> This proposal aims to support function DDL with the consideration of SQL
> > >>>>> syntax, language compliance, and advanced external UDF lib registration.
> > >>>>> The Flink DDL is initialized and discussed in the design
> > >>>>> <https://docs.google.com/document/d/1TTP-GCC8wSsibJaSUyFZ_5NBAHYEB1FVmPpP7RgDGBA/edit#heading=h.wpsqidkaaoil>
> > >>>>> [1] by Shuyi Chen and Timo. As the initial discussion mainly focused on the
> > >>>>> table, type and view. FLIP-69 [2] extend it with a more detailed discussion
> > >>>>> of DDL for catalog, database, and function. Originally the function DDL was
> > >>>>> under the scope of FLIP-69. After some discussion
> > >>>>> <https://issues.apache.org/jira/browse/FLINK-7151> with the community, we
> > >>>>> found that there are several ongoing efforts, such as FLIP-64 [3], FLIP-65
> > >>>>> [4], and FLIP-78 [5]. As they will directly impact the SQL syntax of
> > >>>>> function DDL, the proposal wants to describe the problem clearly with the
> > >>>>> consideration of existing works and make sure the design aligns with
> > >>>>> efforts of API change of temporary objects and type inference for UDF
> > >>>>> defined by different languages.
> > >>>>>
> > >>>>> The FLIP outlines the requirements from related works, and propose a SQL
> > >>>>> syntax to meet those requirements. The corresponding implementation is also
> > >>>>> discussed. Please kindly review and give feedback.
> > >>>>>
> > >>>>>
> > >>>>> Best Regards
> > >>>>> Peter Huang
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>> --
> > >>> Xuefu Zhang
> > >>>
> > >>> "In Honey We Trust!"
> > >>>
> > >
> >
> --
> Best, Jingsong Lee
>