Thank you for the proposal, yuxia! The FLIP looks good to me.

Best,
Jark

> On June 8, 2023, at 11:39, yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>
> Hi, all.
> Thanks everyone for the valuable input. If there are no further concerns
> about this FLIP[1], I would like to start voting next Monday (6/12).
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
>
> Best regards,
> Yuxia
>
> ----- Original Message -----
> From: "Martijn Visser" <martijnvis...@apache.org>
> To: "dev" <dev@flink.apache.org>
> Sent: Tuesday, June 6, 2023, 3:57:56 PM
> Subject: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
>
> Hi Yuxia,
>
> Thanks for the clarification. I would be +0 overall, because I think
> without actually allowing creation/customization of stored procedures, the
> value for the majority of Flink users will be minimal.
>
> Best regards,
>
> Martijn
>
> On Tue, Jun 6, 2023 at 3:52 AM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>
>> Hi, Martijn.
>> Thanks for your feedback.
>> 1: In this FLIP we don't intend to allow users to customize their own
>> stored procedures because, as the FLIP says, we don't want to expose too
>> much to users too early.
>> The procedures are supposed to be provided only by the Catalog. Catalog devs
>> can write their built-in procedures and return them from the method
>> Catalog.getProcedure(ObjectPath procedurePath);
>> So, there won't be SQL syntax to create/save a stored procedure in this
>> FLIP. If we find we do need it, we can propose the SQL syntax to create a
>> stored procedure in another dedicated FLIP.
>>
>> 2: The syntax `Call procedure_name(xx)` proposed in this FLIP is the
>> default syntax in Calcite for calling stored procedures. Actually, we don't
>> need to make any modification to the flink-sql-parser module for the syntax
>> of calling a procedure. MySQL[1], Postgres[2], and Oracle[3] also use this
>> syntax to call a stored procedure.
>>
>> [1] https://dev.mysql.com/doc/refman/8.0/en/call.html
>> [2] https://www.postgresql.org/docs/15/sql-call.html
>> [3] https://docs.oracle.com/javadb/10.8.3.0/ref/rrefcallprocedure.html
>>
>> Best regards,
>> Yuxia
>>
>> ----- Original Message -----
>> From: "Martijn Visser" <martijnvis...@apache.org>
>> To: "dev" <dev@flink.apache.org>
>> Sent: Monday, June 5, 2023, 8:35:44 PM
>> Subject: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
>>
>> Hi Yuxia,
>>
>> Thanks for the FLIP. I have a couple of questions:
>>
>> 1. The syntax talks about how to CALL or SHOW the available stored
>> procedures, but not about how to create one. Will there not be a SQL syntax
>> to create/save a stored procedure?
>> 2. Is there a default syntax in Calcite for stored procedures? What do
>> other databases do: do they use CALL/SHOW or something like EXEC, USE?
>>
>> Best regards,
>>
>> Martijn
>>
>> On Mon, Jun 5, 2023 at 3:23 AM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>>
>>> Hi, Jane.
>>> Thanks for your input. I think we can add the auxiliary command SHOW
>>> PROCEDURES in this FLIP, following the syntax for SHOW FUNCTIONS proposed
>>> in FLIP-297.
>>> The syntax will be
>>> SHOW PROCEDURES [ ( FROM | IN ) [catalog_name.]database_name ] [ [NOT] (LIKE | ILIKE) <sql_like_pattern> ].
>>> I have updated the FLIP accordingly.
>>>
>>> The other auxiliary commands may not be suitable currently or may need
>>> further, dedicated discussion. Let's keep this FLIP focused.
>>>
>>> [1]
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-297%3A+Improve+Auxiliary+Sql+Statements
>>>
>>> Best regards,
>>> Yuxia
>>>
>>> ----- Original Message -----
>>> From: "Jane Chan" <qingyue....@gmail.com>
>>> To: "dev" <dev@flink.apache.org>
>>> Sent: Saturday, June 3, 2023, 7:04:39 PM
>>> Subject: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
>>>
>>> Hi Yuxia,
>>>
>>> Thanks for bringing this to the discussion. The call procedure is a widely
>>> used feature and will be very useful for users.
>>>
>>> I just have one question regarding the usage. The FLIP mentioned that
>>>
>>>> Flink will allow connector developers to develop their own built-in stored
>>>> procedures, and then enables users to call these predefined stored
>>>> procedures.
>>>>
>>>> In this FLIP, we don't intend to allow users to customize their own stored
>>>> procedure for we don't want to expose too much to users too early.
>>>
>>>
>>> If I understand correctly, we might need to provide some auxiliary commands
>>> to inform users what built-in procedures are provided and how to use them.
>>> For example, Snowflake provides commands like [1] [2], and MySQL provides
>>> commands like [3] [4].
>>>
>>> [1] SHOW PROCEDURES,
>>> https://docs.snowflake.com/en/sql-reference/sql/show-procedures
>>> [2] DESCRIBE PROCEDURE <procedure_name>,
>>> https://docs.snowflake.com/en/sql-reference/sql/desc-procedure
>>> [3] SHOW PROCEDURE CODE,
>>> https://dev.mysql.com/doc/refman/5.7/en/show-procedure-code.html
>>> [4] SHOW PROCEDURE STATUS,
>>> https://dev.mysql.com/doc/refman/5.7/en/show-procedure-status.html
>>>
>>> Best,
>>> Jane
>>>
>>> On Sat, Jun 3, 2023 at 3:20 PM Benchao Li <libenc...@apache.org> wrote:
>>>
>>>> Thanks Yuxia for the explanation, it makes sense to me. It would be great
>>>> if you also add this to the FLIP doc.
>>>>
>>>> On Thu, Jun 1, 2023 at 5:11 PM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>>>>
>>>>> Hi, Benchao.
>>>>> Thanks for your attention.
>>>>>
>>>>> Initially, I also wanted to pass `TableEnvironment` to the procedure. But
>>>>> according to my investigation and an offline discussion with Jingsong, the
>>>>> really important thing for procedure devs is the ability to build a Flink
>>>>> datastream. From a `TableEnvironment` we can't get the
>>>>> `StreamExecutionEnvironment`, which is the entry point for building a
>>>>> datastream. That is to say, we will lose the ability to build a datastream
>>>>> if we just pass `TableEnvironment`.
>>>>>
>>>>> Of course, we can also pass `TableEnvironment` along with
>>>>> `StreamExecutionEnvironment` to Procedure. But I intend to be cautious
>>>>> about exposing too much too early to procedure devs. If someday we find we
>>>>> need `TableEnvironment` to customize a procedure, we can then add a
>>>>> method like `getTableEnvironment()` in `ProcedureContext`.
>>>>>
>>>>> Best regards,
>>>>> Yuxia
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Benchao Li" <libenc...@apache.org>
>>>>> To: "dev" <dev@flink.apache.org>
>>>>> Sent: Thursday, June 1, 2023, 12:58:08 PM
>>>>> Subject: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
>>>>>
>>>>> Thanks Yuxia for opening this discussion,
>>>>>
>>>>> The general idea looks good to me, I only have one question about the
>>>>> `ProcedureContext#getExecutionEnvironment`. Why are you proposing to return
>>>>> a `StreamExecutionEnvironment` instead of `TableEnvironment`? Could you
>>>>> elaborate a little more on this?
>>>>>
>>>>> On Tue, May 30, 2023 at 5:58 PM Jingsong Li <jingsongl...@gmail.com> wrote:
>>>>>
>>>>>> Thanks for your explanation.
>>>>>>
>>>>>> We can support Iterable in future. Current design looks good to me.
>>>>>>
>>>>>> Best,
>>>>>> Jingsong
>>>>>>
>>>>>> On Tue, May 30, 2023 at 4:56 PM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>>>>>>>
>>>>>>> Hi, Jingsong.
>>>>>>> Thanks for your feedback.
>>>>>>>
>>>>>>>> Does this need to be a function call? Do you have some example?
>>>>>>> I think it'll be useful to support function calls when a user calls a
>>>>>>> procedure.
>>>>>>> The following example is from Iceberg:[1]
>>>>>>> CALL catalog_name.system.migrate('spark_catalog.db.sample', map('foo', 'bar'));
>>>>>>>
>>>>>>> It allows the user to use `map('foo', 'bar')` to pass map data to the
>>>>>>> procedure.
>>>>>>>
>>>>>>> Another case that I can imagine is rolling back a table to the snapshot
>>>>>>> of one week ago.
>>>>>>> Then, with a function call, the user may call
>>>>>>> `rollback(table_name, now() - INTERVAL '7' DAY)` to achieve that.
>>>>>>>
>>>>>>> Although the argument can be a function call, the parameter eventually
>>>>>>> received by the procedure will always be the evaluated literal.
>>>>>>>
>>>>>>>
>>>>>>>> Procedure looks like a TableFunction, do you consider using Collector
>>>>>>>> something like TableFunction? (Supports large amount of data)
>>>>>>>
>>>>>>> Yes, I had considered it. But returning T[] is for simplicity.
>>>>>>>
>>>>>>> First, regarding how to return the result of calling a procedure, it
>>>>>>> looks more intuitive to me to use the return value of the `call` method
>>>>>>> instead of calling something like collector#collect.
>>>>>>> Introducing a collector will add unnecessary complexity.
>>>>>>>
>>>>>>> Second, regarding supporting large amounts of data, according to my
>>>>>>> investigation, I haven't seen a requirement for returning large
>>>>>>> amounts of data.
>>>>>>> Iceberg also returns an array.[2] If you do think we should support large
>>>>>>> amounts of data, I think we can change the return type from T[] to
>>>>>>> Iterable<T>.
>>>>>>>
>>>>>>> [1]: https://iceberg.apache.org/docs/latest/spark-procedures/#migrate
>>>>>>> [2]: https://github.com/apache/iceberg/blob/601c5af9b6abded79dabeba177331310d5487f43/spark/v3.2/spark/src/main/java/org/apache/spark/sql/connector/iceberg/catalog/Procedure.java#L44
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Yuxia
>>>>>>>
>>>>>>> ----- Original Message -----
>>>>>>> From: "Jingsong Li" <jingsongl...@gmail.com>
>>>>>>> To: "dev" <dev@flink.apache.org>
>>>>>>> Sent: Monday, May 29, 2023, 2:42:04 PM
>>>>>>> Subject: Re: [DISCUSS] FLIP-311: Support Call Stored Procedure
>>>>>>>
>>>>>>> Thanks Yuxia for the proposal.
>>>>>>>
>>>>>>>> CALL [catalog_name.][database_name.]procedure_name ([ expression [, expression]* ] )
>>>>>>>
>>>>>>> The expression can be a function call. Does this need to be a function
>>>>>>> call? Do you have some example?
>>>>>>>
>>>>>>>> Procedure returns T[]
>>>>>>>
>>>>>>> Procedure looks like a TableFunction, do you consider using Collector
>>>>>>> something like TableFunction? (Supports large amount of data)
>>>>>>>
>>>>>>> Best,
>>>>>>> Jingsong
>>>>>>>
>>>>>>> On Mon, May 29, 2023 at 2:33 PM yuxia <luoyu...@alumni.sjtu.edu.cn> wrote:
>>>>>>>>
>>>>>>>> Hi, everyone.
>>>>>>>>
>>>>>>>> I’d like to start a discussion about FLIP-311: Support Call Stored
>>>>>>>> Procedure [1]
>>>>>>>>
>>>>>>>> Stored procedures provide a convenient way to encapsulate complex
>>>>>>>> logic to perform data manipulation or administrative tasks in external
>>>>>>>> storage systems. They are widely used in traditional databases and popular
>>>>>>>> compute engines like Trino for their convenience. Therefore, we propose
>>>>>>>> adding support for calling stored procedures in Flink to enable better
>>>>>>>> integration with external storage systems.
>>>>>>>>
>>>>>>>> With this FLIP, Flink will allow connector developers to develop their
>>>>>>>> own built-in stored procedures, and then enable users to call these
>>>>>>>> predefined stored procedures.
>>>>>>>>
>>>>>>>> Looking forward to your feedback.
>>>>>>>>
>>>>>>>> [1]:
>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-311%3A+Support+Call+Stored+Procedure
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Yuxia
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>>
>>>>> Best,
>>>>> Benchao Li
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Best,
>>>> Benchao Li
>>>>
>>>
>>
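
For readers following the thread, the contract discussed above (a procedure receives a context plus already-evaluated arguments and hands its result back as T[] from `call`, rather than through a collector) might be sketched roughly as follows. This is only a minimal sketch under assumptions: `ProcedureSketch`, `RollbackProcedure`, `executionEnvironment()`, and the string placeholder standing in for `StreamExecutionEnvironment` are hypothetical names made up for illustration. They are not the interfaces defined in FLIP-311 and do not depend on Flink.

```java
import java.util.Arrays;

public class ProcedureSketch {

    /** Hypothetical stand-in for the context handed to a procedure. */
    interface ProcedureContext {
        // Placeholder for the StreamExecutionEnvironment mentioned in the thread.
        String executionEnvironment();
    }

    /** Hypothetical contract: arguments arrive as evaluated literals, results return as an array. */
    interface Procedure<T> {
        T[] call(ProcedureContext context, Object... args);
    }

    /** Illustrative procedure; a real one would act on an external storage system. */
    static final class RollbackProcedure implements Procedure<String> {
        @Override
        public String[] call(ProcedureContext context, Object... args) {
            String table = (String) args[0];
            // An expression such as now() - INTERVAL '7' DAY would already be
            // evaluated to a concrete value before it reaches the procedure.
            long snapshotMillis = (Long) args[1];
            return new String[] {"rolled back " + table + " to " + snapshotMillis};
        }
    }

    /** Calls the illustrative procedure with literal arguments. */
    static String[] runExample() {
        ProcedureContext context = () -> "stream-env-placeholder";
        return new RollbackProcedure().call(context, "db.sample", 1_685_000_000_000L);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(runExample()));
    }
}
```

Returning an array keeps the caller side to a single method call, which is the simplicity argument made in the thread; switching the return type to Iterable<T> later would only change the signature of `call`.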