Hi Jincheng,

+1 to start the FLIP create and VOTE on this feature. I'm willing to help on 
the FLIP create if you don't mind. As I haven't created a FLIP before, it will 
be great if you could help on this. :)

Regards,
Dian

> 在 2019年8月22日,下午11:41,jincheng sun <sunjincheng...@gmail.com> 写道:
> 
> Hi all,
> 
> Thanks a lot for your feedback. If there are no more suggestions and
> comments, I think it's better to  initiate a vote to create a FLIP for
> Apache Flink Python UDFs.
> What do you think?
> 
> Best, Jincheng
> 
> jincheng sun <sunjincheng...@gmail.com> 于2019年8月15日周四 上午12:54写道:
> 
>> Hi Thomas,
>> 
>> Thanks for your confirmation and the very important reminder about bundle
>> processing.
>> 
>> I have had add the description about how to perform bundle processing from
>> the perspective of checkpoint and watermark. Feel free to leave comments if
>> there are anything not describe clearly.
>> 
>> Best,
>> Jincheng
>> 
>> 
>> Dian Fu <dian0511...@gmail.com> 于2019年8月14日周三 上午10:08写道:
>> 
>>> Hi Thomas,
>>> 
>>> Thanks a lot the suggestions.
>>> 
>>> Regarding to bundle processing, there is a section "Checkpoint"[1] in the
>>> design doc which talks about how to handle the checkpoint.
>>> However, I think you are right that we should talk more about it, such as
>>> what's bundle processing, how it affects the checkpoint and watermark, how
>>> to handle the checkpoint and watermark, etc.
>>> 
>>> [1]
>>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
>>> <
>>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
>>>> 
>>> 
>>> Regards,
>>> Dian
>>> 
>>>> 在 2019年8月14日,上午1:01,Thomas Weise <t...@apache.org> 写道:
>>>> 
>>>> Hi Jincheng,
>>>> 
>>>> Thanks for putting this together. The proposal is very detailed,
>>> thorough
>>>> and for me as a Beam Flink runner contributor easy to understand :)
>>>> 
>>>> One thing that you should probably detail more is the bundle
>>> processing. It
>>>> is critically important for performance that multiple elements are
>>>> processed in a bundle. The default bundle size in the Flink runner is
>>> 1s or
>>>> 1000 elements, whichever comes first. And for streaming, you can find
>>> the
>>>> logic necessary to align the bundle processing with watermarks and
>>>> checkpointing here:
>>>> 
>>> https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
>>>> 
>>>> Thomas
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Tue, Aug 13, 2019 at 7:05 AM jincheng sun <sunjincheng...@gmail.com>
>>>> wrote:
>>>> 
>>>>> Hi all,
>>>>> 
>>>>> The Python Table API(without Python UDF support) has already been
>>> supported
>>>>> and will be available in the coming release 1.9.
>>>>> As Python UDF is very important for Python users, we'd like to start
>>> the
>>>>> discussion about the Python UDF support in the Python Table API.
>>>>> Aljoscha Krettek, Dian Fu and I have discussed offline and have
>>> drafted a
>>>>> design doc[1]. It includes the following items:
>>>>> 
>>>>> - The user-defined function interfaces.
>>>>> - The user-defined function execution architecture.
>>>>> 
>>>>> As mentioned by many guys in the previous discussion thread[2], a
>>>>> portability framework was introduced in Apache Beam in latest
>>> releases. It
>>>>> provides well-defined, language-neutral data structures and protocols
>>> for
>>>>> language-neutral user-defined function execution. This design is based
>>> on
>>>>> Beam's portability framework. We will introduce how to make use of
>>> Beam's
>>>>> portability framework for user-defined function execution: data
>>>>> transmission, state access, checkpoint, metrics, logging, etc.
>>>>> 
>>>>> Considering that the design relies on Beam's portability framework for
>>>>> Python user-defined function execution and not all the contributors in
>>>>> Flink community are familiar with Beam's portability framework, we have
>>>>> done a prototype[3] for proof of concept and also ease of
>>> understanding of
>>>>> the design.
>>>>> 
>>>>> Welcome any feedback.
>>>>> 
>>>>> Best,
>>>>> Jincheng
>>>>> 
>>>>> [1]
>>>>> 
>>>>> 
>>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit?usp=sharing
>>>>> [2]
>>>>> 
>>>>> 
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html
>>>>> [3] https://github.com/dianfu/flink/commits/udf_poc
>>>>> 
>>> 
>>> 

Reply via email to