github-actions[bot] commented on code in PR #63763:
URL: https://github.com/apache/doris/pull/63763#discussion_r3311121242


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AggScalarSubQueryToWindowFunction.java:
##########
@@ -278,6 +289,12 @@ private boolean checkRelation(List<Slot> correlatedSlots) {
                 .filter(node -> outerIds.contains(node.getTable().getId()))
                 .map(LogicalRelation.class::cast)
                 .map(LogicalRelation::getOutputExprIdSet).flatMap(Collection::stream).collect(Collectors.toSet());
+        partitionBySlots.addAll(apply.left().getOutput().stream()

Review Comment:
   This still does not distinguish duplicate outer-only rows when their 
distinguishing columns are not present in `apply.left().getOutput()`. For 
example, if `dim` has two rows with the same `k` and the outer query only 
outputs/uses `d.k` (no `d.did` or other unique column), the original scalar 
subquery is evaluated once per `dim` row, but this code partitions the window 
only by the visible `d.k`. The joined inner rows for both duplicate `dim` rows 
then land in the same window partition and the aggregate is multiplied, so 
predicates such as `f.v * 2 > (select sum(f2.v) ... where f2.k = d.k)` can 
incorrectly filter out rows. The new regression test includes `d.did` in the 
select list, so this code picks up a distinguishing slot and the test does not 
cover this case. 
Please either carry/partition by all slots from the outer-only relation needed 
to preserve row identity, or make the rule return false when 
`apply.left().getOutput()` does not contain the full outer-only relation output.
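
The effect can be reproduced outside the planner. The following is a minimal sketch in plain Java, not Doris code: the class name, the method names, and the two tiny tables are hypothetical, chosen only to mirror the `dim`/`fact` example above (two `dim` rows sharing `k = 1`, and `fact` values 6 and 4). It compares the per-row scalar subquery against a window sum partitioned by `k` alone and by `(k, did)`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateOuterRowDemo {
    // Hypothetical data: dim rows are {did, k}, fact rows are {k, v}.
    // Both dim rows share k = 1, so k alone does not identify a dim row.
    static final int[][] DIM = {{1, 1}, {2, 1}};
    static final int[][] FACT = {{1, 6}, {1, 4}};

    // Original semantics: sum(f2.v) where f2.k = d.k is evaluated once per dim row,
    // then joined fact rows are kept when f.v * 2 > that sum.
    static int countWithScalarSubquery() {
        int kept = 0;
        for (int[] d : DIM) {
            long sum = 0;
            for (int[] f2 : FACT) {
                if (f2[0] == d[1]) {
                    sum += f2[1];
                }
            }
            for (int[] f : FACT) {
                if (f[0] == d[1] && f[1] * 2L > sum) {
                    kept++;
                }
            }
        }
        return kept;
    }

    // Rewritten semantics: join fact with dim first, then compute sum(v) over a window.
    // includeDid = false partitions by k only; true partitions by (k, did).
    static int countWithWindow(boolean includeDid) {
        List<int[]> joined = new ArrayList<>(); // each row is {did, k, v}
        for (int[] d : DIM) {
            for (int[] f : FACT) {
                if (f[0] == d[1]) {
                    joined.add(new int[] {d[0], d[1], f[1]});
                }
            }
        }
        Map<String, Long> partitionSums = new HashMap<>();
        for (int[] row : joined) {
            partitionSums.merge(partitionKey(row, includeDid), (long) row[2], Long::sum);
        }
        int kept = 0;
        for (int[] row : joined) {
            if (row[2] * 2L > partitionSums.get(partitionKey(row, includeDid))) {
                kept++;
            }
        }
        return kept;
    }

    static String partitionKey(int[] row, boolean includeDid) {
        return includeDid ? row[1] + "/" + row[0] : String.valueOf(row[1]);
    }

    public static void main(String[] args) {
        System.out.println("scalar subquery:       " + countWithScalarSubquery());
        System.out.println("window by k:           " + countWithWindow(false));
        System.out.println("window by (k, did):    " + countWithWindow(true));
    }
}
```

With this data the scalar subquery yields 10 per `dim` row and keeps two joined rows. The window partitioned by `k` alone sums all four joined rows to 20 and keeps none, while partitioning by `(k, did)` restores the per-row sum of 10 and keeps the same two rows as the original query.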



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

