Hi Jincheng, Thanks for raising the discussion! The key information is very important for query optimizations. It would be nice if we can use upsert mode to achieve better performance.
+1 for the `withKeys` proposal. :) Best, Hequn On Wed, Jun 26, 2019 at 4:37 PM jincheng sun <sunjincheng...@gmail.com> wrote: > Hi all, > > With the continuous efforts from the community, we already supported > `flatAggregate`[1] on TableAPI in retract mode. I think It's better to add > upsert mode for `flatAggregate`. > > The result table of streaming non-window `flatAggregate` is a table > contains updates. We can, of course, use a RetractStreamTableSink[2] to > emit the table, but we can get better performance in upsert mode. However, > due to the lack of keys, we can’t use an UpsertStreamTableSink to emit the > table. We don’t have this problem for a normal aggregate as it emits a > single row for each group, so the unique keys are exactly the same with the > group keys. While for a `flatAggregate`, its pretty difference that due to > emits multi rows(a “sub-table”) for a single group. To solve this problem, > we need to find a way to define keys on flatAggregate, so that we can also > use upsert sink to emit the result table after flatAggregate. > > So, Aljoscha, Hequn and I prepared a design document for how to define the > update keys for `flatAggregate` in upsert mode. The detail can be found > here: > > > https://docs.google.com/document/d/183qHo8PDG-xserDi_AbGP6YX9aPY0rVr80p3O3Gyz6U/edit?usp=sharing > > I appreciate it if you can have look at the document and any comments are > welcome! > > > Best, > > Jincheng > > > [1] > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97552739 > > [2] > > https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sourceSinks.html#defining-a-streamtablesource >