+1 for starting the vote.

Thanks a lot, Jincheng, for the discussion.

Best, Hequn

On Fri, Aug 23, 2019 at 10:06 AM Dian Fu <dian0511...@gmail.com> wrote:

> Hi Jincheng,
>
> +1 to creating the FLIP and starting the VOTE on this feature. I'm willing
> to help create the FLIP if you don't mind. As I haven't created a FLIP
> before, it would be great if you could help with this. :)
>
> Regards,
> Dian
>
> > On Aug 22, 2019, at 11:41 PM, jincheng sun <sunjincheng...@gmail.com> wrote:
> >
> > Hi all,
> >
> > Thanks a lot for your feedback. If there are no more suggestions or
> > comments, I think it would be good to initiate a vote to create a FLIP for
> > Apache Flink Python UDFs.
> > What do you think?
> >
> > Best, Jincheng
> >
> > On Thu, Aug 15, 2019 at 12:54 AM jincheng sun <sunjincheng...@gmail.com> wrote:
> >
> >> Hi Thomas,
> >>
> >> Thanks for your confirmation and the very important reminder about
> >> bundle processing.
> >>
> >> I have added a description of how to perform bundle processing from the
> >> perspective of checkpoints and watermarks. Feel free to leave comments if
> >> anything is not described clearly.
> >>
> >> Best,
> >> Jincheng
> >>
> >>
> >> On Wed, Aug 14, 2019 at 10:08 AM Dian Fu <dian0511...@gmail.com> wrote:
> >>
> >>> Hi Thomas,
> >>>
> >>> Thanks a lot for the suggestions.
> >>>
> >>> Regarding bundle processing, there is a section "Checkpoint"[1] in the
> >>> design doc which talks about how to handle checkpoints.
> >>> However, I think you are right that we should say more about it, such as
> >>> what bundle processing is, how it affects checkpoints and watermarks,
> >>> how to handle them, etc.
> >>>
> >>> [1]
> >>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit#heading=h.urladt565yo3
> >>>
> >>> Regards,
> >>> Dian
> >>>
> >>>> On Aug 14, 2019, at 1:01 AM, Thomas Weise <t...@apache.org> wrote:
> >>>>
> >>>> Hi Jincheng,
> >>>>
> >>>> Thanks for putting this together. The proposal is very detailed,
> >>>> thorough, and, for me as a Beam Flink runner contributor, easy to
> >>>> understand :)
> >>>>
> >>>> One thing that you should probably detail more is the bundle
> >>>> processing. It is critically important for performance that multiple
> >>>> elements are processed in a bundle. The default bundle size in the
> >>>> Flink runner is 1s or 1000 elements, whichever comes first. And for
> >>>> streaming, you can find the logic necessary to align the bundle
> >>>> processing with watermarks and checkpointing here:
> >>>>
> >>>> https://github.com/apache/beam/blob/release-2.14.0/runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/ExecutableStageDoFnOperator.java
> >>>>
> >>>> Thomas
> >>>>
> >>>> On Tue, Aug 13, 2019 at 7:05 AM jincheng sun <sunjincheng...@gmail.com> wrote:
> >>>>
> >>>>> Hi all,
> >>>>>
> >>>>> The Python Table API (without Python UDF support) has already been
> >>>>> implemented and will be available in the upcoming 1.9 release.
> >>>>> As Python UDFs are very important for Python users, we'd like to start
> >>>>> a discussion about Python UDF support in the Python Table API.
> >>>>> Aljoscha Krettek, Dian Fu and I have discussed this offline and have
> >>>>> drafted a design doc[1]. It includes the following items:
> >>>>>
> >>>>> - The user-defined function interfaces.
> >>>>> - The user-defined function execution architecture.
> >>>>>
> >>>>> As many people mentioned in the previous discussion thread[2], a
> >>>>> portability framework was introduced in Apache Beam in recent releases.
> >>>>> It provides well-defined, language-neutral data structures and
> >>>>> protocols for language-neutral user-defined function execution. This
> >>>>> design is based on Beam's portability framework. We will describe how
> >>>>> to make use of Beam's portability framework for user-defined function
> >>>>> execution: data transmission, state access, checkpointing, metrics,
> >>>>> logging, etc.
> >>>>>
> >>>>> Considering that the design relies on Beam's portability framework for
> >>>>> Python user-defined function execution, and that not all contributors
> >>>>> in the Flink community are familiar with it, we have built a
> >>>>> prototype[3] as a proof of concept and to make the design easier to
> >>>>> understand.
> >>>>>
> >>>>> Any feedback is welcome.
> >>>>>
> >>>>> Best,
> >>>>> Jincheng
> >>>>>
> >>>>> [1]
> >>>>> https://docs.google.com/document/d/1WpTyCXAQh8Jr2yWfz7MWCD2-lou05QaQFb810ZvTefY/edit?usp=sharing
> >>>>> [2]
> >>>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-38-Support-python-language-in-flink-TableAPI-td28061.html
> >>>>> [3] https://github.com/dianfu/flink/commits/udf_poc
> >>>>>
> >>>
> >>>
>
>
