Daniel Becker has uploaded this change for review. ( http://gerrit.cloudera.org:8080/22746 )

Change subject: IMPALA-13873: Missing equivalence conjunct in aggregation node with inline views
......................................................................

IMPALA-13873: Missing equivalence conjunct in aggregation node with inline views

Some queries involving plain (distinct) UNIONs miss conjuncts, leading
to incorrect results:

Example:
WITH u1 AS (select 10 a, 10 b),
t AS (
  select a, b, min(b) over (partition by a) min_b from u1
  UNION
  select 10, 10, 20
)
select t.* from t where t.b = t.min_b;

Expected result:
+----+----+-------+
| a  | b  | min_b |
+----+----+-------+
| 10 | 10 | 10    |
+----+----+-------+

Actual result:
+----+----+-------+
| a  | b  | min_b |
+----+----+-------+
| 10 | 10 | 10    |
| 10 | 20 | 10    |
+----+----+-------+

This is caused by MultiAggregateInfo assuming that conjuncts bound by
grouping slots that are produced by SlotRef grouping expressions are
already evaluated below the AggregationNode. However, this is not true
in all cases: with UNIONs, there may be conjuncts that are unassigned
below the AggregationNode.

This may happen if a conjunct cannot be pushed into all operands of a
UNION, because the source tuples in the operands do not contain all of
the slots referenced by the predicate. In the example above, it happens
in the operand with the analytic function: the source tuple, 'u1', does
not contain a slot corresponding to 'min(b)'. In these cases, the
conjuncts need to be evaluated in the AggregationNode (possibly in
addition to some of the UNION operands).

This change fixes this problem by introducing a flag in
MultiAggregateInfo, 'keepConjuncts_', which is set during the planning
of the UNION if there are unassigned conjuncts remaining. If this flag
is set, MultiAggregateInfo will not assume that its conjuncts are
redundant and will evaluate them.

Testing:
- Added a PlannerTest and an EE test for the case where a conjunct was
  previously incorrectly removed from the AggregationNode.
- Existing tests cover the case when conjuncts can be safely removed
  from an AggregationNode above a UnionNode because the conjuncts are
  pushed into all union operands, see for example
  https://github.com/apache/impala/blob/6f2d9a24d8c014a7dc1ec7a08bcfb025b3bdf41f/testdata/workloads/functional-planner/queries/PlannerTest/union.test#L3914

Change-Id: I67a59cd96d83181ce249fd6ca141906f549a09b3
---
M fe/src/main/java/org/apache/impala/analysis/MultiAggregateInfo.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M testdata/workloads/functional-planner/queries/PlannerTest/union.test
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
4 files changed, 93 insertions(+), 11 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/22746/4
--
To view, visit http://gerrit.cloudera.org:8080/22746
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I67a59cd96d83181ce249fd6ca141906f549a09b3
Gerrit-Change-Number: 22746
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Noemi Pap-Takacs <npaptak...@cloudera.com>
Gerrit-Reviewer: Peter Rozsa <pro...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
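
Illustrative sketch (not part of the change): the rule described in the
commit message can be modelled roughly as below. This is a simplified toy
model, not the Impala code. The class KeepConjunctsSketch, its nested
Conjunct type and both method names are hypothetical. It also assumes that
"unassigned conjuncts remaining" can be approximated as "some conjunct
references a slot that at least one UNION operand does not provide". Only
the flag name keepConjuncts comes from the commit message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical toy model of the rule above; not Impala's MultiAggregateInfo. */
final class KeepConjunctsSketch {

  /** A predicate, reduced to the names of the slots it references. */
  static final class Conjunct {
    final String sql;
    final Set<String> slots;

    Conjunct(String sql, String... slots) {
      this.sql = sql;
      this.slots = new HashSet<>(Arrays.asList(slots));
    }
  }

  /**
   * Assumption: a conjunct stays unassigned below the aggregation if at least
   * one UNION operand's source tuple lacks a slot the conjunct references.
   */
  static boolean hasUnassignedConjuncts(
      List<Conjunct> conjuncts, List<Set<String>> operandSlots) {
    for (Conjunct c : conjuncts) {
      for (Set<String> operand : operandSlots) {
        if (!operand.containsAll(c.slots)) return true;
      }
    }
    return false;
  }

  /**
   * Returns the conjuncts the aggregation node evaluates. Conjuncts bound by
   * SlotRef grouping slots are dropped as redundant only when keepConjuncts
   * is false.
   */
  static List<Conjunct> aggNodeConjuncts(List<Conjunct> conjuncts,
      Set<String> slotRefGroupingSlots, boolean keepConjuncts) {
    List<Conjunct> result = new ArrayList<>();
    for (Conjunct c : conjuncts) {
      boolean boundByGroupingSlots = slotRefGroupingSlots.containsAll(c.slots);
      if (keepConjuncts || !boundByGroupingSlots) result.add(c);
    }
    return result;
  }
}
```

In this model, the conjunct from the example (t.b = t.min_b) references
'min_b', which an operand providing only 'a' and 'b' cannot supply, so
hasUnassignedConjuncts() returns true and aggNodeConjuncts() keeps the
conjunct. With the flag false, the same conjunct is dropped as redundant.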