Jark Wu created FLINK-12263:
-------------------------------

             Summary: Remove SINGLE_VALUE aggregate function from physical plan
                 Key: FLINK-12263
                 URL: https://issues.apache.org/jira/browse/FLINK-12263
             Project: Flink
          Issue Type: New Feature
          Components: Table SQL / Planner
            Reporter: Jark Wu

SINGLE_VALUE is an aggregate function that accepts only one row and throws an exception when it receives more than one row.

For example:

{code:sql}
SELECT a2, SUM(a1)
FROM A
GROUP BY a2
HAVING SUM(a1) > (SELECT SUM(a1) * 0.1 FROM A)
{code}

produces a physical plan that contains SINGLE_VALUE:

{code:sql}
+- NestedLoopJoin(joinType=[InnerJoin], where=[>(EXPR$1, $f0)], select=[a2, EXPR$1, $f0], build=[right], singleRowJoin=[true])
   :- HashAggregate(isMerge=[true], groupBy=[a2], select=[a2, Final_SUM(sum$0) AS EXPR$1])
   :  +- Exchange(distribution=[hash[a2]])
   :     +- LocalHashAggregate(groupBy=[a2], select=[a2, Partial_SUM(a1) AS sum$0])
   :        +- TableSourceScan(table=[[A, source: [TestTableSource(a1, a2)]]], fields=[a1, a2])
   +- Exchange(distribution=[broadcast])
      +- HashAggregate(isMerge=[true], select=[Final_SINGLE_VALUE(value$0, count$1) AS $f0])
         +- Exchange(distribution=[single])
            +- LocalHashAggregate(select=[Partial_SINGLE_VALUE(EXPR$0) AS (value$0, count$1)])
               +- Calc(select=[*($f0, 0.1) AS EXPR$0])
                  +- HashAggregate(isMerge=[true], select=[Final_SUM(sum$0) AS $f0])
                     +- Exchange(distribution=[single])
                        +- LocalHashAggregate(select=[Partial_SUM(a1) AS sum$0])
                           +- Calc(select=[a1])
                              +- TableSourceScan(table=[[A, source: [TestTableSource(a1, a2)]]], fields=[a1, a2])
{code}

However, SINGLE_VALUE is a bit weird in the physical plan, because the logical plan can already ensure that there is only one input row. Moreover, it introduces additional overhead: in the plan above, it adds an extra local/final aggregation pair and an extra exchange.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)