[jira] [Commented] (FLINK-3475) DISTINCT aggregate function support

Fabian Hueske (JIRA) Fri, 20 May 2016 02:17:12 -0700

    [ https://issues.apache.org/jira/browse/FLINK-3475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15293052#comment-15293052 ]


Fabian Hueske commented on FLINK-3475:
--------------------------------------

DISTINCT aggregates can be computed by sorting each reduce group on the distinct
attribute (secondary sort) and skipping duplicate values, which are adjacent
after the sort. A first step would be to support a single distinct attribute,
since a group can have only one attribute as its primary sort key.
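
The single-attribute case can be sketched outside of Flink in plain Python. This
is only an illustration of the idea, not the proposed implementation: the
function name count_distinct_per_group and the (group_key, value) record layout
are assumptions made for the example, and the in-memory sort stands in for the
secondary sort the runtime would provide.

```python
from itertools import groupby
from operator import itemgetter

_UNSET = object()  # sentinel: no value seen yet in the current group


def count_distinct_per_group(records):
    """Count distinct values per group key.

    records: iterable of (group_key, value) tuples (hypothetical layout).
    Returns a dict mapping group_key -> number of distinct values.
    """
    # Sort on (group key, distinct attribute). The second component plays
    # the role of the secondary sort inside each reduce group.
    ordered = sorted(records, key=itemgetter(0, 1))
    result = {}
    for key, group in groupby(ordered, key=itemgetter(0)):
        count = 0
        previous = _UNSET
        for _, value in group:
            # Duplicates are adjacent after sorting, so comparing with the
            # previous value is enough to skip them.
            if previous is _UNSET or value != previous:
                count += 1
                previous = value
        result[key] = count
    return result
```

Because only the previous value is remembered, the per-group state stays
constant in size regardless of how many distinct values a group contains.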

For multiple distinct aggregates, we have to split the aggregation into
several group reduce operators and join their results afterwards. The join can
be done locally as a streamed merge join, because partitioning and sorting
will be preserved.
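
A rough Python sketch of the multi-attribute case, again only to illustrate the
idea: each distinct attribute gets its own sort-and-reduce pass, and the
per-key results, which come out sorted on the grouping key, are combined with a
merge join. The helper names distinct_count_sorted and merge_join and the
tuple-based row layout are assumptions made for the example, not Flink APIs.

```python
from itertools import groupby
from operator import itemgetter


def distinct_count_sorted(rows, key_idx, attr_idx):
    """One 'group reduce' pass: distinct count of one attribute per key.

    Returns a list of (key, distinct_count) pairs sorted by key.
    """
    ordered = sorted(rows, key=itemgetter(key_idx, attr_idx))
    out = []
    for key, group in groupby(ordered, key=itemgetter(key_idx)):
        # Within a group, equal attribute values are adjacent, so each
        # inner run corresponds to exactly one distinct value.
        distinct = sum(1 for _ in groupby(group, key=itemgetter(attr_idx)))
        out.append((key, distinct))
    return out


def merge_join(left, right):
    """Merge join of two lists of (key, value) pairs sorted by unique key."""
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key == right_key:
            joined.append((left_key, left[i][1], right[j][1]))
            i += 1
            j += 1
        elif left_key < right_key:
            i += 1
        else:
            j += 1
    return joined


# Hypothetical rows: (group key, first attribute, second attribute).
rows = [("a", 1, "x"), ("a", 1, "y"), ("a", 2, "y"), ("b", 3, "z")]
per_key = merge_join(
    distinct_count_sorted(rows, 0, 1),
    distinct_count_sorted(rows, 0, 2),
)
```

Both inputs to the join are already ordered by the grouping key, so the join
only walks the two lists once and needs no extra sorting or hashing.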

> DISTINCT aggregate function support
> -----------------------------------
>
>                 Key: FLINK-3475
>                 URL: https://issues.apache.org/jira/browse/FLINK-3475
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table API
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>
> The DISTINCT aggregate function may be able to reuse the existing aggregate
> function instead of requiring a separate implementation, letting the Flink
> runtime take care of duplicate records.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
