github-actions[bot] commented on code in PR #63690:
URL: https://github.com/apache/doris/pull/63690#discussion_r3445283936
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/PushDownAggregation.java:
##########
@@ -255,18 +246,16 @@ public Plan visitLogicalAggregate(LogicalAggregate<?
extends Plan> agg, JobConte
Map<Expression, Expression> replaceMap = new
HashMap<>();
List<AggregateFunction> relatedAggFunc =
aggFunctionsForOutputExpressions.get(ne);
for (AggregateFunction func : relatedAggFunc) {
- Slot pushedDownSlot =
pushDownContext.getAliasMap().get(func).toSlot();
+ Alias pushedAlias =
pushDownContext.getAliasMap().get(func);
+ ExprId pushId = pushedAlias.getExprId();
+ if (!state.hasAggFuncOutput(pushId)) {
+ continue;
+ }
+ Expression value =
state.getPushedAggFuncSlot(pushId);
if (func instanceof Count) {
- // For count(A), after pushdown we have
count(A) as x,
- // and the top agg should use sum(x) instead
of count(x).
- // Wrap with ifnull(..., 0) because COUNT
never returns NULL,
- // but after pushdown across an outer join,
the intermediate count
- // slot can be NULL (null-extended), making
sum(NULL) = NULL.
- Function rollUpFunc = ((RollUpTrait)
func).constructRollUp(pushedDownSlot);
- replaceMap.put(func, new Nvl(rollUpFunc, new
BigIntLiteral(0)));
+ replaceMap.put(func, new Sum0(value));
Review Comment:
This replacement map is keyed by the aggregate child expression, so a single
output expression that contains multiple aggregates over the same slot, or also
references the grouped slot, can be rewritten to the wrong pushed value.
Reduced tree:
```text
Aggregate(output: (sum(l.v) + max(l.v)) AS x, group by l.k)
Join(l.k = r.k)
Scan l(k, v)
Scan r(k)
```
After pushdown, `state` has `sum(l.v) -> sum_v` and `max(l.v) -> max_v`.
This loop first puts `l.v -> sum_v`, then overwrites the same key with `l.v ->
max_v`. `ExpressionUtils.replace()` walks the whole output expression top-down,
so the parent becomes `sum(max_v) + max(max_v)` instead of `sum(sum_v) +
max(max_v)`, which changes results whenever a key has different `l.v` values.
The same shape appears with `sum(l.v) + l.v` when `l.v` is also a group key.
Please avoid using one shared child-expression map for non-decomposed
aggregates. Replace the aggregate function node itself for normal SUM/MIN/MAX
rollups, and keep the decomposed-IF path from colliding with other occurrences
of the same child expression.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]