Hi Jincheng, +1 to start the FLIP create and VOTE on this feature. I'm willing to help on the FLIP create if you don't mind. As I haven't created a FLIP before, it will be great if you could help on this. :)
Regards, Dian > 在 2019年8月22日,下午11:41,jincheng sun <sunjincheng...@gmail.com> 写道: > > Hi all, > > Thanks a lot for your feedback. If there are no more suggestions and > comments, I think it's better to initiate a vote to create a FLIP for > Apache Flink Python UDFs. > What do you think? > > Best, Jincheng > > jincheng sun <sunjincheng...@gmail.com> 于2019年8月15日周四 上午12:54写道: > >> Hi Thomas, >> >> Thanks for your confirmation and the very important reminder about bundle >> processing. >> >> I have had add the description about how to perform bundle processing from >> the perspective of checkpoint and watermark. Feel free to leave comments if >> there are anything not describe clearly. >> >> Best, >> Jincheng >> >> >> Dian Fu <dian0511...@gmail.com> 于2019年8月14日周三 上午10:08写道: >> >>> Hi Thomas, >>> >>> Thanks a lot the suggestions. >>> >>> Regarding to bundle processing, there is a section "Checkpoint"[1] in the >>> design doc which talks about how to handle the checkpoint. >>> However, I think you are right that we should talk more about it, such as >>> what's bundle processing, how it affects the checkpoint and watermark, how >>> to handle the checkpoint and watermark, etc. >>> >>> [1] >>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3 >>> < >>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3 >>>> >>> >>> Regards, >>> Dian >>> >>>> 在 2019年8月14日,上午1:01,Thomas Weise <t...@apache.org> 写道: >>>> >>>> Hi Jincheng, >>>> >>>> Thanks for putting this together. The proposal is very detailed, >>> thorough >>>> and for me as a Beam Flink runner contributor easy to understand :) >>>> >>>> One thing that you should probably detail more is the bundle >>> processing. It >>>> is critically important for performance that multiple elements are >>>> processed in a bundle. The default bundle size in the Flink runner is >>> 1s or >>>> 1000 elements, whichever comes first. And for streaming, you can find >>> the >>>> logic necessary to align the bundle processing with watermarks and >>>> checkpointing here: >>>> >>> https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java >>>> >>>> Thomas >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> On Tue, Aug 13, 2019 at 7:05 AM jincheng sun <sunjincheng...@gmail.com> >>>> wrote: >>>> >>>>> Hi all, >>>>> >>>>> The Python Table API(without Python UDF support) has already been >>> supported >>>>> and will be available in the coming release 1.9. >>>>> As Python UDF is very important for Python users, we'd like to start >>> the >>>>> discussion about the Python UDF support in the Python Table API. >>>>> Aljoscha Krettek, Dian Fu and I have discussed offline and have >>> drafted a >>>>> design doc[1]. It includes the following items: >>>>> >>>>> - The user-defined function interfaces. >>>>> - The user-defined function execution architecture. >>>>> >>>>> As mentioned by many guys in the previous discussion thread[2], a >>>>> portability framework was introduced in Apache Beam in latest >>> releases. It >>>>> provides well-defined, language-neutral data structures and protocols >>> for >>>>> language-neutral user-defined function execution. This design is based >>> on >>>>> Beam's portability framework. We will introduce how to make use of >>> Beam's >>>>> portability framework for user-defined function execution: data >>>>> transmission, state access, checkpoint, metrics, logging, etc. >>>>> >>>>> Considering that the design relies on Beam's portability framework for >>>>> Python user-defined function execution and not all the contributors in >>>>> Flink community are familiar with Beam's portability framework, we have >>>>> done a prototype[3] for proof of concept and also ease of >>> understanding of >>>>> the design. >>>>> >>>>> Welcome any feedback. >>>>> >>>>> Best, >>>>> Jincheng >>>>> >>>>> [1] >>>>> >>>>> >>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit?usp=sharing >>>>> [2] >>>>> >>>>> >>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html >>>>> [3] https://github.com/dianfu/flink/commits/udf_poc >>>>> >>> >>>