+1 for starting the vote. Thanks Jincheng a lot for the discussion.
Best, Hequn On Fri, Aug 23, 2019 at 10:06 AM Dian Fu <dian0511...@gmail.com> wrote: > Hi Jincheng, > > +1 to start the FLIP create and VOTE on this feature. I'm willing to help > on the FLIP create if you don't mind. As I haven't created a FLIP before, > it will be great if you could help on this. :) > > Regards, > Dian > > > 在 2019年8月22日,下午11:41,jincheng sun <sunjincheng...@gmail.com> 写道: > > > > Hi all, > > > > Thanks a lot for your feedback. If there are no more suggestions and > > comments, I think it's better to initiate a vote to create a FLIP for > > Apache Flink Python UDFs. > > What do you think? > > > > Best, Jincheng > > > > jincheng sun <sunjincheng...@gmail.com> 于2019年8月15日周四 上午12:54写道: > > > >> Hi Thomas, > >> > >> Thanks for your confirmation and the very important reminder about > bundle > >> processing. > >> > >> I have had add the description about how to perform bundle processing > from > >> the perspective of checkpoint and watermark. Feel free to leave > comments if > >> there are anything not describe clearly. > >> > >> Best, > >> Jincheng > >> > >> > >> Dian Fu <dian0511...@gmail.com> 于2019年8月14日周三 上午10:08写道: > >> > >>> Hi Thomas, > >>> > >>> Thanks a lot the suggestions. > >>> > >>> Regarding to bundle processing, there is a section "Checkpoint"[1] in > the > >>> design doc which talks about how to handle the checkpoint. > >>> However, I think you are right that we should talk more about it, such > as > >>> what's bundle processing, how it affects the checkpoint and watermark, > how > >>> to handle the checkpoint and watermark, etc. > >>> > >>> [1] > >>> > https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3 > >>> < > >>> > https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3 > >>>> > >>> > >>> Regards, > >>> Dian > >>> > >>>> 在 2019年8月14日,上午1:01,Thomas Weise <t...@apache.org> 写道: > >>>> > >>>> Hi Jincheng, > >>>> > >>>> Thanks for putting this together. The proposal is very detailed, > >>> thorough > >>>> and for me as a Beam Flink runner contributor easy to understand :) > >>>> > >>>> One thing that you should probably detail more is the bundle > >>> processing. It > >>>> is critically important for performance that multiple elements are > >>>> processed in a bundle. The default bundle size in the Flink runner is > >>> 1s or > >>>> 1000 elements, whichever comes first. And for streaming, you can find > >>> the > >>>> logic necessary to align the bundle processing with watermarks and > >>>> checkpointing here: > >>>> > >>> > https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java > >>>> > >>>> Thomas > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> > >>>> On Tue, Aug 13, 2019 at 7:05 AM jincheng sun < > sunjincheng...@gmail.com> > >>>> wrote: > >>>> > >>>>> Hi all, > >>>>> > >>>>> The Python Table API(without Python UDF support) has already been > >>> supported > >>>>> and will be available in the coming release 1.9. > >>>>> As Python UDF is very important for Python users, we'd like to start > >>> the > >>>>> discussion about the Python UDF support in the Python Table API. > >>>>> Aljoscha Krettek, Dian Fu and I have discussed offline and have > >>> drafted a > >>>>> design doc[1]. It includes the following items: > >>>>> > >>>>> - The user-defined function interfaces. > >>>>> - The user-defined function execution architecture. > >>>>> > >>>>> As mentioned by many guys in the previous discussion thread[2], a > >>>>> portability framework was introduced in Apache Beam in latest > >>> releases. It > >>>>> provides well-defined, language-neutral data structures and protocols > >>> for > >>>>> language-neutral user-defined function execution. This design is > based > >>> on > >>>>> Beam's portability framework. We will introduce how to make use of > >>> Beam's > >>>>> portability framework for user-defined function execution: data > >>>>> transmission, state access, checkpoint, metrics, logging, etc. > >>>>> > >>>>> Considering that the design relies on Beam's portability framework > for > >>>>> Python user-defined function execution and not all the contributors > in > >>>>> Flink community are familiar with Beam's portability framework, we > have > >>>>> done a prototype[3] for proof of concept and also ease of > >>> understanding of > >>>>> the design. > >>>>> > >>>>> Welcome any feedback. > >>>>> > >>>>> Best, > >>>>> Jincheng > >>>>> > >>>>> [1] > >>>>> > >>>>> > >>> > https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit?usp=sharing > >>>>> [2] > >>>>> > >>>>> > >>> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html > >>>>> [3] https://github.com/dianfu/flink/commits/udf_poc > >>>>> > >>> > >>> > >