Support distinct aggregation over data stream on Table/SQL API

Rong Rong Tue, 13 Feb 2018 18:08:20 -0800

Hi Community,

We are working on supporting distinct aggregations over data streams in the
Table/SQL API. Currently there seem to be many JIRAs related to distinct
aggregation over streams that are still pending (FLINK-6249
<https://issues.apache.org/jira/browse/FLINK-6249>, FLINK-6260
<https://issues.apache.org/jira/browse/FLINK-6260>, FLINK-5315
<https://issues.apache.org/jira/browse/FLINK-5315>, FLINK-6335
<https://issues.apache.org/jira/browse/FLINK-6335>, FLINK-6373
<https://issues.apache.org/jira/browse/FLINK-6373>, FLINK-6250
<https://issues.apache.org/jira/browse/FLINK-6250>, etc.), and I have some
concerns about settling on a solution, as there might be other use cases
out there that these do not cover.


I put together a write-up that categorizes the use cases into unbounded and
bounded aggregations and proposes a solution: modifying the existing
aggregate functions and adding new distinct aggregate functions using the
UDAGG API with DataView. Please find it here
<https://docs.google.com/document/d/1zj6OA-K2hi7ah8Fo-xTQB-mVmYfm6LsN2_NHgTCVmJI/edit?usp=sharing>.
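
As a rough illustration of the general idea (a sketch only, not taken from
the linked document), a distinct count can be kept by storing a per-value
reference count in the accumulator, so that values can also be retracted
when they leave a bounded window. The class name DistinctCountSketch and
its Acc type are hypothetical, and a plain HashMap stands in for the state
that a DataView would hold; the method names only loosely mirror the shape
of a UDAGG (createAccumulator / accumulate / retract / getValue).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch in plain Java with no Flink dependency.
final class DistinctCountSketch {

    // Accumulator: per-value reference counts plus the running distinct count.
    // The HashMap is an assumption standing in for DataView-backed state.
    static final class Acc {
        final Map<String, Long> seen = new HashMap<>();
        long distinct = 0L;
    }

    static Acc createAccumulator() {
        return new Acc();
    }

    // Add one occurrence; the distinct count grows only on the first one.
    static void accumulate(Acc acc, String value) {
        long n = acc.seen.merge(value, 1L, Long::sum);
        if (n == 1L) {
            acc.distinct++;
        }
    }

    // Remove one occurrence; the distinct count shrinks only when the
    // last occurrence of the value is retracted.
    static void retract(Acc acc, String value) {
        Long n = acc.seen.get(value);
        if (n == null) {
            return; // nothing to retract for an unseen value
        }
        if (n == 1L) {
            acc.seen.remove(value);
            acc.distinct--;
        } else {
            acc.seen.put(value, n - 1L);
        }
    }

    static long getValue(Acc acc) {
        return acc.distinct;
    }

    public static void main(String[] args) {
        Acc acc = createAccumulator();
        accumulate(acc, "a");
        accumulate(acc, "b");
        accumulate(acc, "a");
        System.out.println(getValue(acc)); // prints 2
        retract(acc, "a");
        System.out.println(getValue(acc)); // prints 2 ("a" still has one occurrence)
        retract(acc, "a");
        System.out.println(getValue(acc)); // prints 1
    }
}
```

For the unbounded case only accumulate is needed; the reference counts
matter for the bounded case, where rows expire and must be retracted.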

Any comments or suggestions are highly appreciated.

Many Thanks,
Rong
