Re: Support local aggregate push down for Blink batch planner

Kurt Young Mon, 04 Jan 2021 17:52:06 -0800

Local aggregation is more of a physical operator than a logical one.
I would suggest going with idea #1.


Best,
Kurt
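
To illustrate the remark above: a two-phase (local then global) aggregation computes the same logical result as a single aggregate, which is why the split is usually treated as a physical planning decision. The following is a minimal plain-Java sketch, not Flink planner code; the class and method names (`TwoPhaseSum`, `localSum`, `globalSum`) are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: plain Java, not Flink planner code. All names are hypothetical. */
final class TwoPhaseSum {

    /** Local phase: one split (e.g. one source partition) pre-aggregates its own rows per key. */
    static Map<String, Long> localSum(List<Map.Entry<String, Long>> rows) {
        Map<String, Long> partial = new LinkedHashMap<>();
        for (Map.Entry<String, Long> row : rows) {
            partial.merge(row.getKey(), row.getValue(), Long::sum);
        }
        return partial;
    }

    /** Global phase: merge the partial sums produced by all splits. */
    static Map<String, Long> globalSum(List<Map<String, Long>> partials) {
        Map<String, Long> result = new LinkedHashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, value) -> result.merge(key, value, Long::sum));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Long>> split1 =
                List.of(Map.entry("a", 1L), Map.entry("b", 2L), Map.entry("a", 3L));
        List<Map.Entry<String, Long>> split2 =
                List.of(Map.entry("a", 10L), Map.entry("c", 5L));

        // Only the small partial maps cross the "network" between the two phases.
        Map<String, Long> total = globalSum(List.of(localSum(split1), localSum(split2)));
        System.out.println(total); // prints {a=14, b=2, c=5}
    }
}
```

If the local phase runs inside the storage system, only the partial results need to travel to the engine, which is the saving in network IO and computation that the proposal is after.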


On Wed, Dec 30, 2020 at 8:31 PM Sebastian Liu <[email protected]> wrote:

> Hi Jark, thanks a lot for your quick reply and valuable suggestions.
> For (1): Agreed. Since we are in the middle of migrating to the new table
> source API, the new optimization rule should target the new interfaces.
> Otherwise we would have to upgrade the rule sooner or later anyway. I have
> changed the SupportsAggregatePushDown definition in the proposal above to
> use an ability interface.
>
> For (2): Agreed. Using CallExpression is the better choice, and I have
> resolved this comment in the proposal.
>
> For (3): I suggest we support the JDBC connector first, since there is no
> Druid connector and the ES connector only has a sink API at present.
>
> But perhaps the biggest open question is whether we should go with idea 1
> or idea 2 in the proposal.
>
> What do you think? Once we reach agreement on the proposal, our team can
> drive this feature to completion.
>
> Jark Wu <[email protected]> wrote on Tue, Dec 29, 2020 at 2:58 PM:
>
> > Hi Sebastian,
> >
> > Thanks for the proposal. I think this is a great improvement for Flink SQL.
> > I went through the design doc and have the following thoughts:
> >
> > 1) Flink deprecated the legacy TableSource in 1.11 and proposed a new
> > set of DynamicTableSource interfaces. Could you update your proposal to
> > use the new interfaces? Please follow the existing ability interfaces,
> > e.g. SupportsFilterPushDown and SupportsProjectionPushDown.
> >
> > 2) Personally, I think CallExpression would be a better representation
> > than a separate `FunctionDefinition` and args, because it makes it easier
> > to know the index and type of each argument.
> >
> > 3) It would be good to list which connectors the plan will support.
> >
> > Best,
> > Jark
> >
> >
> > On Tue, 29 Dec 2020 at 00:49, Sebastian Liu <[email protected]> wrote:
> >
> > > Hi all,
> > >
> > > I'd like to discuss a new feature for the Blink Planner.
> > > Aggregate operators in Flink SQL are currently executed entirely in the
> > > Flink layer. As storage systems have evolved, many of the storage
> > > systems used with Flink SQL can now evaluate aggregations themselves.
> > > Pushing aggregates down to the data source layer will improve
> > > performance by reducing network IO and computation overhead.
> > >
> > > I have drafted a design doc for this new feature.
> > >
> > >
> > > https://docs.google.com/document/d/1kGwC_h4qBNxF2eMEz6T6arByOB8yilrPLqDN0QBQXW4/edit?usp=sharing
> > >
> > > Any comment or discussion is welcome.
> > >
> > > --
> > >
> > > *With kind regards
> > > ------------------------------------------------------------
> > > Sebastian Liu 刘洋
> > > Institute of Computing Technology, Chinese Academy of Science
> > > Mobile\WeChat: +86—15201613655
> > > E-mail: [email protected] <[email protected]>
> > > QQ: 3239559*
> > >
> >
>
>
>
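
For readers following points (1) to (3) in the thread, here is a rough sketch of what an aggregate push-down ability could look like when the function and its arguments travel together in one call object. This is not the interface from the design doc and not a Flink API: `FieldRef`, `AggCall`, `AggregatePushDownAbility`, `ToyJdbcSource`, and the boolean return value are all assumptions made for illustration, written in plain Java with no Flink dependency.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-in for a resolved field reference: carries both index and type. */
final class FieldRef {
    final int index;
    final String type;
    FieldRef(int index, String type) { this.index = index; this.type = type; }
}

/** Hypothetical stand-in for a call expression: function and arguments in one object. */
final class AggCall {
    final String function;
    final List<FieldRef> args;
    AggCall(String function, List<FieldRef> args) { this.function = function; this.args = args; }
}

/** Hypothetical ability interface, loosely modeled on the Supports*PushDown naming. */
interface AggregatePushDownAbility {
    /** Returns true if the source takes over the aggregation, false to leave it to the engine. */
    boolean applyAggregates(int[] groupingFields, List<AggCall> aggregates);
}

/** Toy JDBC-like source that turns an accepted push-down into a query string. */
final class ToyJdbcSource implements AggregatePushDownAbility {
    private static final Set<String> SUPPORTED = Set.of("SUM", "COUNT", "MIN", "MAX");
    private final String table;
    private final String[] columns;
    private String query;

    ToyJdbcSource(String table, String... columns) {
        this.table = table;
        this.columns = columns;
        this.query = "SELECT " + String.join(", ", columns) + " FROM " + table;
    }

    @Override
    public boolean applyAggregates(int[] groupingFields, List<AggCall> aggregates) {
        List<String> groupBy = new ArrayList<>();
        for (int field : groupingFields) {
            groupBy.add(columns[field]);
        }
        List<String> select = new ArrayList<>(groupBy);
        for (AggCall call : aggregates) {
            if (!SUPPORTED.contains(call.function) || call.args.size() != 1) {
                return false; // rejected: the scan query stays unchanged
            }
            // The argument index is read directly from the call, no separate lookup needed.
            select.add(call.function + "(" + columns[call.args.get(0).index] + ")");
        }
        String sql = "SELECT " + String.join(", ", select) + " FROM " + table;
        if (!groupBy.isEmpty()) {
            sql += " GROUP BY " + String.join(", ", groupBy);
        }
        query = sql;
        return true;
    }

    String query() { return query; }
}
```

With a source over `orders(user_id, amount)`, calling `applyAggregates(new int[] {0}, List.of(new AggCall("SUM", List.of(new FieldRef(1, "BIGINT")))))` is accepted and rewrites the scan into a grouped query, while an unsupported function is rejected and leaves the plain scan in place.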
