github-actions[bot] commented on code in PR #63763:
URL: https://github.com/apache/doris/pull/63763#discussion_r3435864545


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AggScalarSubQueryToWindowFunction.java:
##########
@@ -139,7 +140,8 @@ && checkAggregate()
                 && checkJoin()
                 && checkProject()
                 && checkRelation(apply.getCorrelationSlot())
-                && checkFilter(outerFilter);
+                && checkFilter(outerFilter)
+                && checkUniqueCorrelatedTable(apply.getCorrelationSlot());
     }

Review Comment:
   This new uniqueness check proves that the outer-only table has at most one 
row per correlated key, but the rewrite still accepts plans where the top outer 
filter has extra predicates on a relation that is also scanned inside the 
subquery. Reduced plan:
   
   ```text
   Filter(f.k = d.k, f.v > 6, f.v * 2 > sum_alias)
     Apply(correlation: d.k)
       CrossJoin
         Scan fact f
         Scan dim d   -- d.k is unique, so this new check passes
       Aggregate(sum(f2.v) AS sum_alias)
         Filter(f2.k = d.k)
           Scan fact f2
   ```
   
   `checkFilter` only proves that the inner conjunct `f2.k = d.k` is present in 
the outer filter after slot replacement. It does not reject the unmatched outer 
conjunct `f.v > 6`, and `rewrite` puts all `conjuncts.get(true)` predicates 
into `newFilter` below `LogicalWindow`. The generated `SUM(v) OVER (PARTITION 
BY d.k)` therefore sees only `fact` rows with `f.v > 6`, while the original 
scalar subquery sums all `fact` rows for the same `d.k`.
   
   For example, with one unique `dim` row `k=1` and three `fact` rows for that 
key with values `5, 6, 7`, the original subquery sum is `18`, so row `v=7` 
fails `7 * 2 > 18` and the query returns nothing. After this rewrite the window 
sees only the row with `v=7`, the window sum is `7`, and the row is returned. 
Please either reject unmatched outer predicates that reference any shared/inner 
relation slot, or split the predicates so that only outer-only predicates are 
applied below the window while shared-side filters remain above it.
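
   The difference can be reproduced outside the optimizer with a small 
simulation. This is only a sketch under stated assumptions: the class and 
method names are hypothetical and are not Doris APIs, `fact` rows are modelled 
as `{k, v}` int pairs, and the single `dim` row is reduced to its key. It 
compares the original scalar-subquery semantics with the window evaluated over 
rows filtered below it and with the shared-side filter kept above it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; it models the query semantics, not the Nereids rule.
class WindowFilterPlacementDemo {

    // Sum of v over all rows with the given key (rows are {k, v} pairs).
    static int sumForKey(List<int[]> rows, int key) {
        int sum = 0;
        for (int[] r : rows) {
            if (r[0] == key) {
                sum += r[1];
            }
        }
        return sum;
    }

    // Original plan: the scalar subquery sums ALL fact rows for d.k.
    static List<Integer> original(List<int[]> fact, int dimKey) {
        List<Integer> out = new ArrayList<>();
        int subquerySum = sumForKey(fact, dimKey);
        for (int[] f : fact) {
            if (f[0] == dimKey && f[1] > 6 && f[1] * 2 > subquerySum) {
                out.add(f[1]);
            }
        }
        return out;
    }

    // Rewrite as described in the comment: f.v > 6 sits BELOW the window,
    // so the partition contains only the rows that survived it.
    static List<Integer> filterBelowWindow(List<int[]> fact, int dimKey) {
        List<int[]> below = new ArrayList<>();
        for (int[] f : fact) {
            if (f[0] == dimKey && f[1] > 6) {
                below.add(f);
            }
        }
        List<Integer> out = new ArrayList<>();
        for (int[] f : below) {
            int windowSum = sumForKey(below, f[0]); // SUM(v) OVER (PARTITION BY k)
            if (f[1] * 2 > windowSum) {
                out.add(f[1]);
            }
        }
        return out;
    }

    // Suggested split: only the join predicate goes below the window;
    // the shared-side filter f.v > 6 stays above it.
    static List<Integer> filterAboveWindow(List<int[]> fact, int dimKey) {
        List<int[]> below = new ArrayList<>();
        for (int[] f : fact) {
            if (f[0] == dimKey) {
                below.add(f);
            }
        }
        List<Integer> out = new ArrayList<>();
        for (int[] f : below) {
            int windowSum = sumForKey(below, f[0]);
            if (f[1] > 6 && f[1] * 2 > windowSum) {
                out.add(f[1]);
            }
        }
        return out;
    }

    // The example data from the comment: key 1 with values 5, 6, 7.
    static List<int[]> sampleFact() {
        List<int[]> fact = new ArrayList<>();
        fact.add(new int[] {1, 5});
        fact.add(new int[] {1, 6});
        fact.add(new int[] {1, 7});
        return fact;
    }

    public static void main(String[] args) {
        List<int[]> fact = sampleFact();
        System.out.println("original:            " + original(fact, 1));
        System.out.println("filter below window: " + filterBelowWindow(fact, 1));
        System.out.println("filter above window: " + filterAboveWindow(fact, 1));
    }
}
```

   With the sample data, the original form and the filter-above-window form 
both return no rows, while the filter-below-window form returns the row with 
`v=7`, which matches the mismatch described above.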



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]