github-actions[bot] commented on code in PR #64820:
URL: https://github.com/apache/doris/pull/64820#discussion_r3472174620


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/PushDownAggregation.java:
##########
@@ -143,14 +141,16 @@ public Plan visitLogicalAggregate(LogicalAggregate<? 
extends Plan> agg, JobConte
                     if (aggFunction.containsVolatileExpression()) {
                         return agg;
                     }
-                    // CaseWhen and If (which CASE WHEN is normalized into) 
must both be checked.
-                    // When an agg function contains an If/CaseWhen whose 
condition tests IS NULL
-                    // (e.g. count(if(col IS NULL, value, NULL))), pushing it 
to the nullable side
-                    // of an outer join produces wrong results: null-extended 
rows make "col IS NULL"
-                    // TRUE at the top level, but the pre-aggregated count 
slot becomes NULL after
-                    // null-extension, and ifnull(sum(NULL), 0) = 0 instead of 
the correct 1.
-                    if (!hasCaseWhen && aggFunction.anyMatch(e -> e instanceof 
CaseWhen || e instanceof If)) {
-                        hasCaseWhen = true;
+                    // NullToNonNullFunction: expressions that can convert 
NULL input to non-NULL output
+                    // (e.g. COALESCE, NVL, IF, CASE WHEN, NULL_OR_EMPTY, 
NOT_NULL_OR_EMPTY).
+                    // When an agg function contains such an expression 
wrapping a column from the
+                    // nullable side of an outer join, null-extended rows 
would produce non-NULL values
+                    // that get counted by the aggregation. But the 
pre-aggregation on the base table
+                    // cannot see null-extended rows (they are produced by the 
join), so the push-down

Review Comment:
   The marker set is still incomplete for the same nullable-side outer-join 
bug. For example:
   
   ```text
   Aggregate(group by t1.id1, output=count(to_ipv4_or_default(t2.ip)))
     LeftOuterJoin(t1.id1 = t2.id2)
       t1
       t2
   ```
   
   `to_ipv4_or_default`, `to_ipv6_or_default`, `ipv4_string_to_num_or_default`, 
and `ipv6_string_to_num_or_default` all return default non-NULL values for 
NULL/bad input (`ipv4_string_to_num_or_default(NULL)` has existing expected 
output `0`, and `to_ipv4_or_default(NULL)`/`to_ipv6_or_default(NULL)` produce 
`0.0.0.0`/`::`). None of those classes implements `NullToNonNullFunction`, so 
`containsNullToNonNull` remains false here. `adjustPushSideForNullable` then 
allows the right-side push because the aggregate has a right-side input slot, 
but the lower aggregate over `t2` cannot see unmatched left rows. After null 
extension the rolled-up count contributes `sum0(NULL)=0`, while the original 
expression counts the default IP value and should contribute 1. Please include 
these null-to-default functions in the guard, or make the helper conservative 
enough to catch all functions that produce a non-NULL value from nullable-side 
NULL input.



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/eageraggregation/PushDownAggregation.java:
##########
@@ -18,11 +18,10 @@
 package org.apache.doris.nereids.rules.rewrite.eageraggregation;
 
 import org.apache.doris.nereids.jobs.JobContext;
-import org.apache.doris.nereids.rules.analysis.NormalizeAggregate;
 import org.apache.doris.nereids.rules.rewrite.AdjustNullable;
 import org.apache.doris.nereids.trees.expressions.Alias;
-import org.apache.doris.nereids.trees.expressions.CaseWhen;
 import org.apache.doris.nereids.trees.expressions.ExprId;
+import org.apache.doris.nereids.trees.expressions.NullToNonNullFunction;

Review Comment:
   FE checkstyle sorts imports within a group alphabetically 
(`CustomImportOrder` has `sortImportsInGroupAlphabetically=true`). This new 
import is currently between `ExprId` and `Expression`, so the file is not 
sorted and the FE style gate should fail. Please move `NullToNonNullFunction` 
below `NamedExpression` (matching the order in `EagerAggRewriter.java`) before 
rerunning the FE validation.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to