Hi everyone,

Thanks to all of you for the discussion.
If there are no objections, I would like to start a vote thread tomorrow.

Best,
Xingbo

On Thu, Sep 3, 2020 at 5:45 PM, Dian Fu <dian0511...@gmail.com> wrote:

> Thanks for preparing the FLIP, Xingbo!
>
> LGTM overall and looking forward to the voting!
>
> Regards,
> Dian
>
> > On Sep 3, 2020, at 5:22 PM, jincheng sun <sunjincheng...@gmail.com> wrote:
> >
> > Thank you! Looking forward to the voting :)
> >
> > Best,
> > Jincheng
> >
> >
> > On Thu, Sep 3, 2020 at 2:39 PM, Xingbo Huang <hxbks...@gmail.com> wrote:
> >
> >> Hi Jincheng,
> >>
> >> Yes, I agree that users can extend the `AggregateFunction` class if they
> >> want to define a Pandas UDAF as a custom class. I have updated that part
> >> of the FLIP.
> >>
> >> Best,
> >> Xingbo
> >>
> >> On Thu, Sep 3, 2020 at 1:48 PM, jincheng sun <sunjincheng...@gmail.com> wrote:
> >>
> >>> Thanks for the update Xingbo!
> >>>
> >>> Pandas UDAF can reuse the `class AggregateFunction(UserDefinedFunction)`
> >>> interface from FLIP-139, with the user's core Pandas UDAF logic written
> >>> in the `accumulate` method. In this way, we can unify the interface
> >>> semantics of all UDAFs.
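
A minimal sketch of the idea above, not PyFlink's actual API: the
`AggregateFunction` base class here is a hypothetical stand-in defined locally,
its method names other than `accumulate` are assumptions, and plain Python
lists stand in for pandas.Series so the example has no dependencies. It only
illustrates keeping the user's core logic in `accumulate`.

```python
from abc import ABC, abstractmethod
from statistics import mean


class AggregateFunction(ABC):
    """Hypothetical stand-in for the base class named in the thread."""

    @abstractmethod
    def create_accumulator(self):
        """Return a fresh, mutable accumulator."""

    @abstractmethod
    def accumulate(self, accumulator, *columns):
        """Fold one batch of input columns into the accumulator."""

    @abstractmethod
    def get_value(self, accumulator):
        """Return the final aggregate held by the accumulator."""


class MeanUdaf(AggregateFunction):
    """User-defined aggregate: the core logic lives in accumulate()."""

    def create_accumulator(self):
        return [None]

    def accumulate(self, accumulator, *columns):
        # A vectorized UDAF sees a whole column at once; a list stands in
        # for a pandas.Series here (assumption made for this sketch).
        accumulator[0] = mean(columns[0])

    def get_value(self, accumulator):
        return accumulator[0]


def run_group(func, *columns):
    """Hypothetical driver showing the order in which the methods are called."""
    acc = func.create_accumulator()
    func.accumulate(acc, *columns)
    return func.get_value(acc)
```

With this shape, a function-style UDAF and a class-style UDAF can share one
calling convention, which is the unification being proposed.
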
> >>>
> >>> What do you think?
> >>>
> >>> Best,
> >>> Jincheng
> >>>
> >>>
> >>>
> >>> On Mon, Aug 31, 2020 at 6:06 PM, Xingbo Huang <hxbks...@gmail.com> wrote:
> >>>
> >>>> Hi Jincheng,
> >>>>
> >>>> Thanks a lot for joining the discussion and for the suggestion to discuss
> >>>> FLIP-137 and FLIP-139 together.
> >>>>
> >>>>>> 1. We also need to consider how pandas UDAF supports metrics, and
> >>>>>> whether we need a custom interface for pandas UDAF?
> >>>>
> >>>> Yes. We need to add an interface so that users can put logic such as
> >>>> creating metrics in the `open` or `close` method. I have added the
> >>>> definition of the interface and a corresponding example to the doc.
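
A minimal sketch of what registering a metric in `open` could look like. All
classes below (`Counter`, `MetricGroup`, `FunctionContext`, `CountingMax`) are
hypothetical stand-ins defined locally for illustration; they are not taken
from the FLIP or from PyFlink, and the metric name "batches" is made up.

```python
class Counter:
    """Hypothetical counter metric."""

    def __init__(self):
        self.count = 0

    def inc(self, n=1):
        self.count += n


class MetricGroup:
    """Hypothetical registry that hands out one counter per name."""

    def __init__(self):
        self.counters = {}

    def counter(self, name):
        if name not in self.counters:
            self.counters[name] = Counter()
        return self.counters[name]


class FunctionContext:
    """Hypothetical context object passed to open()."""

    def __init__(self):
        self._group = MetricGroup()

    def get_metric_group(self):
        return self._group


class CountingMax:
    """Aggregate that counts how many batches it has processed."""

    def open(self, function_context):
        # Set-up logic, such as creating metrics, goes here.
        self.batches = function_context.get_metric_group().counter("batches")

    def accumulate(self, accumulator, values):
        self.batches.inc()
        accumulator[0] = max(values)

    def close(self):
        # Tear-down logic goes here; this sketch just drops the reference.
        self.batches = None
```

The runtime would be expected to call `open` once before any `accumulate`
call and `close` once at the end, so the metric is created a single time.
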
> >>>>
> >>>>>> 2. We have added @udaf(); will ordinary Python UDAFs use it as well?
> >>>>
> >>>> Yes. Looking at Python user-defined functions as a whole, we use @udf to
> >>>> describe both general Python UDFs and Pandas UDFs, @udtf to describe
> >>>> Python UDTFs, and @udaf to describe both general Python UDAFs and Pandas
> >>>> UDAFs, which is more unified. I will discuss this in FLIP-139 later.
> >>>>
> >>>> Best,
> >>>> Xingbo
> >>>>
> >>>> On Mon, Aug 31, 2020 at 11:05 AM, jincheng sun <sunjincheng...@gmail.com> wrote:
> >>>>
> >>>>> Hi Xingbo,
> >>>>>
> >>>>> Thanks for the discussion! Overall, +1 for this FLIP.
> >>>>> I have two points to add:
> >>>>>
> >>>>> - We also need to consider how pandas UDAF supports metrics, and whether
> >>>>> we need a custom interface for pandas UDAF?
> >>>>> - We have added @udaf(); will ordinary Python UDAFs use it as well? If
> >>>>> not, adding @udaf is not appropriate. We need to discuss this further.
> >>>>>
> >>>>> We can consider designing this in combination with FLIP-139. What do you
> >>>>> think?
> >>>>>
> >>>>> Best,
> >>>>> Jincheng
> >>>>>
> >>>>>
> >>>>> On Mon, Aug 24, 2020 at 2:25 PM, Xingbo Huang <hxbks...@gmail.com> wrote:
> >>>>>
> >>>>>> Hi everyone,
> >>>>>>
> >>>>>> I would like to start a discussion thread on "Support Pandas UDAF in
> >>>>>> PyFlink".
> >>>>>>
> >>>>>> Pandas UDF has been supported since Flink 1.11 (FLIP-97[1]). It solves the
> >>>>>> high serialization/deserialization overhead in Python UDF and makes it
> >>>>>> convenient to leverage popular Python libraries such as Pandas, NumPy,
> >>>>>> etc. Since Pandas UDF has so many advantages, we want to support Pandas
> >>>>>> UDAF to extend the usage of Pandas UDF.
> >>>>>>
> >>>>>> Dian Fu and I have discussed this offline and have drafted FLIP-137[2]. It
> >>>>>> includes the following items:
> >>>>>>  - Support Pandas UDAF in Batch Group Aggregation
> >>>>>>  - Support Pandas UDAF in Batch Group Window Aggregation
> >>>>>>  - Support Pandas UDAF in Batch Over Window Aggregation
> >>>>>>  - Support Pandas UDAF in Stream Group Window Aggregation
> >>>>>>  - Support Pandas UDAF in Stream Bounded Over Window Aggregation
> >>>>>>
> >>>>>>
> >>>>>> Looking forward to your feedback!
> >>>>>>
> >>>>>> Best,
> >>>>>> Xingbo
> >>>>>>
> >>>>>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-97%3A+Support+Scalar+Vectorized+Python+UDF+in+PyFlink
> >>>>>> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-137%3A+Support+Pandas+UDAF+in+PyFlink