Hi Jincheng,

Thanks for raising the discussion!
The key information is very important for query optimizations. It would be
nice if we can use upsert mode to achieve better performance.

+1 for the `withKeys` proposal. :)

Best, Hequn


On Wed, Jun 26, 2019 at 4:37 PM jincheng sun <sunjincheng...@gmail.com>
wrote:

> Hi all,
>
> With the continuous efforts from the community, we already supported
> `flatAggregate`[1] on TableAPI in retract mode. I think It's better to add
> upsert mode for  `flatAggregate`.
>
> The result table of streaming non-window `flatAggregate` is a table
> contains updates. We can, of course, use a RetractStreamTableSink[2] to
> emit the table, but we can get better performance in upsert mode.  However,
> due to the lack of keys, we can’t use an UpsertStreamTableSink to emit the
> table. We don’t have this problem for a normal aggregate as it emits a
> single row for each group, so the unique keys are exactly the same with the
> group keys. While for a `flatAggregate`, its pretty difference that due to
> emits multi rows(a “sub-table”) for a single group. To solve this problem,
> we need to find a way to define keys on flatAggregate, so that we can also
> use upsert sink to emit the result table after flatAggregate.
>
> So, Aljoscha, Hequn and I prepared a design document for how to define the
> update keys for  `flatAggregate` in upsert mode.  The detail can be found
> here:
>
>
> https://docs.google.com/document/d/183qHo8PDG-xserDi_AbGP6YX9aPY0rVr80p3O3Gyz6U/edit?usp=sharing
>
> I appreciate it if you can have look at the document and any comments are
> welcome!
>
>
> Best,
>
> Jincheng
>
>
> [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97552739
>
> [2]
>
> https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sourceSinks.html#defining-a-streamtablesource
>

Reply via email to