[ https://issues.apache.org/jira/browse/FLINK-8355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16318619#comment-16318619 ]
Fabian Hueske commented on FLINK-8355:
--------------------------------------

The motivation for the {{DataSetAggregateWithNullValuesRule}} is to prevent incorrect aggregation results for empty tables. For instance, the query {{SELECT COUNT(*) FROM mytable}} should return a row {{(0)}} and not an empty result.

Until now, the built-in aggregations worked correctly because they ignore {{null}} values. However, UDAGGs might compute incorrect results if they do not ignore {{null}} values. Hence, it definitely makes sense to remove the rule.

A solution would be to add a {{MapPartitionFunction}} with parallelism 1 after a groupless aggregation. The {{MapPartitionFunction}} would simply forward all input data. If the input is empty, it emits a single result row with all aggregates in their initial state.

> DataSet Should not union a NULL row for AGG without GROUP BY clause.
> --------------------------------------------------------------------
>
>                 Key: FLINK-8355
>                 URL: https://issues.apache.org/jira/browse/FLINK-8355
>             Project: Flink
>          Issue Type: Bug
>          Components: Table API & SQL
>    Affects Versions: 1.5.0
>            Reporter: sunjincheng
>
> Currently {{DataSetAggregateWithNullValuesRule}} will UNION a NULL row for
> non-grouped aggregate queries. Once {{CountAggFunction}} supports
> {{COUNT(*)}} (FLINK-8325), the result will be incorrect.
> For example, if table {{T1}} has 3 records and we run the following SQL in
> DataSet:
> {code}
> SELECT COUNT(*) as cnt from T1 // cnt = 4 (incorrect).
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
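
The forwarding logic proposed in the comment above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Flink implementation: the class {{EmptyInputFallback}}, the method {{forwardOrDefault}}, and the {{Supplier}} that produces the initial-state row are hypothetical names. In Flink, the same logic would presumably sit inside a {{MapPartitionFunction}} running on an operator with parallelism 1, so that exactly one instance sees all (or no) input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical, Flink-free sketch of "forward everything, emit a default row if empty".
public class EmptyInputFallback {

    // Forwards every input row unchanged. If the input contained no rows,
    // emits a single row built from the aggregates' initial state
    // (e.g. 0 for COUNT). The supplier is an assumption of this sketch.
    public static <T> List<T> forwardOrDefault(Iterable<T> rows, Supplier<T> initialRow) {
        List<T> out = new ArrayList<>();
        for (T row : rows) {
            out.add(row);
        }
        if (out.isEmpty()) {
            out.add(initialRow.get());
        }
        return out;
    }

    public static void main(String[] args) {
        // Non-empty aggregate result: forwarded as is.
        System.out.println(forwardOrDefault(Arrays.asList(3L), () -> 0L));
        // Empty aggregate result: replaced by the initial-state row.
        System.out.println(forwardOrDefault(Collections.<Long>emptyList(), () -> 0L));
    }
}
```

Because the default row is added only after the aggregation and only when nothing was produced, it never passes through the aggregate function itself, so a UDAGG that does not ignore {{null}} values is not affected.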