Hi Jincheng and Dian,

Sorry for being late to the party. I took a glance at the proposal; LGTM
in general, and I left only a couple of comments.

Thanks,
Bowen

On Mon, Aug 26, 2019 at 8:05 PM Dian Fu <dian0511...@gmail.com> wrote:

> Hi Jincheng,
>
> Thanks! It works.
>
> Thanks,
> Dian
>
> > On Aug 27, 2019, at 10:55 AM, jincheng sun <sunjincheng...@gmail.com> wrote:
> >
> > Hi Dian, can you check if you have edit access? :)
> >
> > On Mon, Aug 26, 2019 at 10:52 AM Dian Fu <dian0511...@gmail.com> wrote:
> >
> >> Hi Jincheng,
> >>
> >> I appreciate the kind tips and the offer of help. I definitely need it!
> >> Could you grant me write permission for Confluence? My ID: Dian Fu
> >>
> >> Thanks,
> >> Dian
> >>
> >>> On Aug 26, 2019, at 9:53 AM, jincheng sun <sunjincheng...@gmail.com> wrote:
> >>>
> >>> Thanks for your feedback, Hequn & Dian.
> >>>
> >>> Dian, I am glad to see that you want to help create the FLIP!
> >>> Everyone has a first time, and I am very willing to help you
> >>> complete your first FLIP. Here are some tips:
> >>>
> >>> - First, I'll give your account write permission for Confluence.
> >>> - Before creating the FLIP, please have a look at the FLIP
> >>>   Template [1]. (It's better to learn more about FLIPs by
> >>>   reading [2].)
> >>> - Create the Flink Python UDF related JIRAs after the VOTE on the
> >>>   FLIP completes. (I think you can also bring up the VOTE thread,
> >>>   if you want!)
> >>>
> >>> If you encounter any problems during this period, feel free to
> >>> tell me so that we can solve them together. :)
> >>>
> >>> Best,
> >>> Jincheng
> >>>
> >>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP+Template
> >>> [2] https://cwiki.apache.org/confluence/display/FLINK/Flink+Improvement+Proposals
> >>>
> >>> On Fri, Aug 23, 2019 at 11:54 AM Hequn Cheng <chenghe...@gmail.com> wrote:
> >>>
> >>>> +1 for starting the vote.
> >>>>
> >>>> Thanks a lot, Jincheng, for the discussion.
> >>>>
> >>>> Best, Hequn
> >>>>
> >>>> On Fri, Aug 23, 2019 at 10:06 AM Dian Fu <dian0511...@gmail.com> wrote:
> >>>>
> >>>>> Hi Jincheng,
> >>>>>
> >>>>> +1 to creating the FLIP and starting the VOTE on this feature.
> >>>>> I'm willing to help create the FLIP if you don't mind. As I
> >>>>> haven't created a FLIP before, it would be great if you could
> >>>>> help with this. :)
> >>>>>
> >>>>> Regards,
> >>>>> Dian
> >>>>>
> >>>>>> On Aug 22, 2019, at 11:41 PM, jincheng sun <sunjincheng...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> Thanks a lot for your feedback. If there are no more
> >>>>>> suggestions or comments, I think it's better to initiate a
> >>>>>> vote to create a FLIP for Apache Flink Python UDFs.
> >>>>>> What do you think?
> >>>>>>
> >>>>>> Best, Jincheng
> >>>>>>
> >>>>>> On Thu, Aug 15, 2019 at 12:54 AM jincheng sun <sunjincheng...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Hi Thomas,
> >>>>>>>
> >>>>>>> Thanks for your confirmation and the very important reminder
> >>>>>>> about bundle processing.
> >>>>>>>
> >>>>>>> I have added a description of how to perform bundle
> >>>>>>> processing from the perspective of checkpoints and
> >>>>>>> watermarks. Feel free to leave comments if anything is not
> >>>>>>> described clearly.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Jincheng
> >>>>>>>
> >>>>>>> On Wed, Aug 14, 2019 at 10:08 AM Dian Fu <dian0511...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Hi Thomas,
> >>>>>>>>
> >>>>>>>> Thanks a lot for the suggestions.
> >>>>>>>>
> >>>>>>>> Regarding bundle processing, there is a section
> >>>>>>>> "Checkpoint" [1] in the design doc which talks about how to
> >>>>>>>> handle the checkpoint. However, I think you are right that
> >>>>>>>> we should say more about it, such as what bundle processing
> >>>>>>>> is, how it affects the checkpoint and watermark, how to
> >>>>>>>> handle the checkpoint and watermark, etc.
> >>>>>>>>
> >>>>>>>> [1]
> >>>>>>>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Dian
> >>>>>>>>
> >>>>>>>>> On Aug 14, 2019, at 1:01 AM, Thomas Weise <t...@apache.org> wrote:
> >>>>>>>>>
> >>>>>>>>> Hi Jincheng,
> >>>>>>>>>
> >>>>>>>>> Thanks for putting this together. The proposal is very
> >>>>>>>>> detailed, thorough and, for me as a Beam Flink runner
> >>>>>>>>> contributor, easy to understand :)
> >>>>>>>>>
> >>>>>>>>> One thing that you should probably detail more is the
> >>>>>>>>> bundle processing. It is critically important for
> >>>>>>>>> performance that multiple elements are processed in a
> >>>>>>>>> bundle. The default bundle size in the Flink runner is 1s
> >>>>>>>>> or 1000 elements, whichever comes first. And for
> >>>>>>>>> streaming, you can find the logic necessary to align the
> >>>>>>>>> bundle processing with watermarks and checkpointing here:
> >>>>>>>>>
> >>>>>>>>> https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
> >>>>>>>>>
> >>>>>>>>> Thomas
> >>>>>>>>>
> >>>>>>>>> On Tue, Aug 13, 2019 at 7:05 AM jincheng sun <sunjincheng...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi all,
> >>>>>>>>>>
> >>>>>>>>>> The Python Table API (without Python UDF support) is
> >>>>>>>>>> already supported and will be available in the upcoming
> >>>>>>>>>> 1.9 release.
> >>>>>>>>>> As Python UDFs are very important for Python users, we'd
> >>>>>>>>>> like to start the discussion about Python UDF support in
> >>>>>>>>>> the Python Table API. Aljoscha Krettek, Dian Fu and I
> >>>>>>>>>> have discussed this offline and have drafted a design
> >>>>>>>>>> doc [1]. It includes the following items:
> >>>>>>>>>>
> >>>>>>>>>> - The user-defined function interfaces.
> >>>>>>>>>> - The user-defined function execution architecture.
> >>>>>>>>>>
> >>>>>>>>>> As mentioned by many people in the previous discussion
> >>>>>>>>>> thread [2], a portability framework was introduced in
> >>>>>>>>>> Apache Beam in recent releases. It provides well-defined,
> >>>>>>>>>> language-neutral data structures and protocols for
> >>>>>>>>>> language-neutral user-defined function execution. This
> >>>>>>>>>> design is based on Beam's portability framework. We will
> >>>>>>>>>> describe how to make use of Beam's portability framework
> >>>>>>>>>> for user-defined function execution: data transmission,
> >>>>>>>>>> state access, checkpoint, metrics, logging, etc.
> >>>>>>>>>>
> >>>>>>>>>> Considering that the design relies on Beam's portability
> >>>>>>>>>> framework for Python user-defined function execution and
> >>>>>>>>>> that not all contributors in the Flink community are
> >>>>>>>>>> familiar with it, we have built a prototype [3] as a
> >>>>>>>>>> proof of concept and to make the design easier to
> >>>>>>>>>> understand.
> >>>>>>>>>>
> >>>>>>>>>> Any feedback is welcome.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Jincheng
> >>>>>>>>>>
> >>>>>>>>>> [1]
> >>>>>>>>>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit?usp=sharing
> >>>>>>>>>> [2]
> >>>>>>>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html
> >>>>>>>>>> [3] https://github.com/dianfu/flink/commits/udf_poc
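
A note on the bundle processing that Thomas describes in this thread: the following is only a minimal Python sketch of the idea, not the Beam Flink runner's implementation and not the design in the doc. It buffers elements and hands them over as one bundle once 1000 elements have arrived or 1 second has passed since the bundle started, whichever comes first. It also assumes that the alignment with checkpoints and watermarks is achieved by finishing the open bundle before a checkpoint is taken and before a watermark is forwarded. The names `BundleBuffer`, `process_bundle`, `on_checkpoint` and `on_watermark` are hypothetical, and the time limit is only checked when an element arrives (a real operator would presumably use a timer).

```python
import time


class BundleBuffer:
    """Hypothetical helper: groups elements into bundles by size or age."""

    def __init__(self, process_bundle, max_size=1000, max_seconds=1.0,
                 clock=time.monotonic):
        self.process_bundle = process_bundle  # callback that receives a list
        self.max_size = max_size              # flush after this many elements
        self.max_seconds = max_seconds        # ... or after this much time
        self.clock = clock                    # injectable for testing
        self.elements = []
        self.started_at = None

    def add(self, element):
        # The age of a bundle is measured from its first element.
        if not self.elements:
            self.started_at = self.clock()
        self.elements.append(element)
        too_big = len(self.elements) >= self.max_size
        too_old = self.clock() - self.started_at >= self.max_seconds
        if too_big or too_old:
            self.finish_bundle()

    def finish_bundle(self):
        # Hand the buffered elements over as one bundle; no-op when empty.
        if self.elements:
            bundle, self.elements = self.elements, []
            self.started_at = None
            self.process_bundle(bundle)

    def on_checkpoint(self):
        # Assumption: no element may be left in flight when state is snapshotted.
        self.finish_bundle()

    def on_watermark(self, watermark, emit_watermark):
        # Assumption: a watermark is forwarded only after the elements that
        # arrived before it have been processed.
        self.finish_bundle()
        emit_watermark(watermark)
```

With the defaults, `BundleBuffer(print)` would print a list of up to 1000 elements per bundle; passing a fake `clock` makes the time-based flush easy to exercise in a test.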