morrySnow commented on code in PR #62742:
URL: https://github.com/apache/doris/pull/62742#discussion_r3354877345


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughSetOperation.java:
##########
@@ -61,11 +63,47 @@ public Rule build() {
             .when(s -> s.arity() > 0
                     || (s instanceof LogicalUnion && !((LogicalUnion) s).getConstantExprsList().isEmpty())))
             .thenApply(ctx -> {
-                LogicalFilter<LogicalSetOperation> filter = ctx.root;
-                LogicalSetOperation setOperation = filter.child();
+                LogicalFilter<LogicalSetOperation> origFilter = ctx.root;
+                LogicalSetOperation setOperation = origFilter.child();
+
+                // Pushing a conjunct that contains a volatile expression (rand/uuid/random_bytes/...)
+                // into each branch changes semantics for every set-op except UNION ALL.
+                // - UNION ALL: each branch row = exactly one output row (1:1), so evaluating
+                //   rand() once per branch row still matches the per-output-row semantic.
+                // - UNION DISTINCT / INTERSECT / EXCEPT: the set-op semantics depend on the
+                //   full branch row sets before dedup/intersect/except. Sampling rows in each
+                //   branch independently changes which rows participate (e.g. INTERSECT becomes
+                //   "half of A intersect half of B" instead of "half of (A intersect B)").
+                boolean canPushVolatileExpr = setOperation instanceof LogicalUnion
+                        && setOperation.getQualifier() == Qualifier.ALL;
+                Set<Expression> pushableConjuncts;
+                Set<Expression> keptAboveConjuncts;
+                if (canPushVolatileExpr) {
+                    pushableConjuncts = origFilter.getConjuncts();
+                    keptAboveConjuncts = ImmutableSet.of();
+                } else {
+                    pushableConjuncts = new LinkedHashSet<>();
+                    Set<Expression> kept = new LinkedHashSet<>();
+                    for (Expression c : origFilter.getConjuncts()) {
+                        if (c.containsVolatileExpression()) {
+                            kept.add(c);
+                        } else {
+                            pushableConjuncts.add(c);
+                        }
+                    }
+                    keptAboveConjuncts = kept;
+                    if (pushableConjuncts.isEmpty()) {
+                        return null;
+                    }
+                }
+                LogicalFilter<LogicalSetOperation> filter = pushableConjuncts == origFilter.getConjuncts()

Review Comment:
   The reference equality check `pushableConjuncts == origFilter.getConjuncts()`
   decides whether to reuse the original filter or create a new one. This is
   currently correct, because the UNION ALL branch at line 82 assigns
   `pushableConjuncts = origFilter.getConjuncts()` (the same reference), but it
   is fragile: if `getConjuncts()` were ever changed to return a defensive copy
   or a view, the `==` check would silently break.
   
   Consider replacing it with an explicit boolean flag, which would be more
   robust and self-documenting:
   ```java
   boolean allConjunctsPushable = canPushVolatileExpr;
   // ...
   LogicalFilter<LogicalSetOperation> filter = allConjunctsPushable
           ? origFilter
           : new LogicalFilter<>(ImmutableSet.copyOf(pushableConjuncts), setOperation);
   ```
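
To illustrate the fragility outside of Doris, here is a minimal, self-contained sketch. `FilterStub` and its `defensiveCopy` switch are hypothetical stand-ins, not the real `LogicalFilter` API; they only model a getter that may or may not return the same object on every call.

```java
import java.util.LinkedHashSet;
import java.util.Set;

final class ReuseDecisionSketch {
    // Hypothetical stand-in for a plan node exposing its conjuncts (not the Doris API).
    static final class FilterStub {
        private final Set<String> conjuncts;
        private final boolean defensiveCopy;

        FilterStub(Set<String> conjuncts, boolean defensiveCopy) {
            this.conjuncts = conjuncts;
            this.defensiveCopy = defensiveCopy;
        }

        // Returns the internal set, or a fresh copy when defensiveCopy is true.
        Set<String> getConjuncts() {
            return defensiveCopy ? new LinkedHashSet<>(conjuncts) : conjuncts;
        }
    }

    // Fragile: only true if getConjuncts() hands back the same object twice.
    static boolean reuseByIdentity(FilterStub filter) {
        Set<String> pushable = filter.getConjuncts();
        return pushable == filter.getConjuncts();
    }

    // Robust: the decision is an explicit flag, independent of the getter.
    static boolean reuseByFlag(boolean allConjunctsPushable) {
        return allConjunctsPushable;
    }

    public static void main(String[] args) {
        Set<String> conjuncts = new LinkedHashSet<>();
        conjuncts.add("a > 1");
        System.out.println(reuseByIdentity(new FilterStub(conjuncts, false))); // true
        System.out.println(reuseByIdentity(new FilterStub(conjuncts, true)));  // false
        System.out.println(reuseByFlag(true));                                 // true
    }
}
```

With the flag, the reuse decision no longer depends on how the getter is implemented.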



##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java:
##########
@@ -269,6 +271,85 @@ public <T extends Expression> Optional<Pair<List<T>, LogicalProject<Plan>>> rewr
         return Optional.of(Pair.of(newTargetsBuilder.build(), new LogicalProject<>(projects, plan.child(0))));
     }
 
+    private Optional<JoinRewriteResult> rewriteJoinExpressions(LogicalJoin<Plan, Plan> join,
+            Collection<Expression> targets) {
+        Map<Expression, Integer> volatileExpressionCounter = Maps.newLinkedHashMap();
+        Map<Expression, Set<Slot>> volatileExpressionSlots = Maps.newLinkedHashMap();
+        for (Expression target : targets) {
+            target.foreach(e -> {
+                Expression expr = (Expression) e;
+                if (expr.isVolatile()) {
+                    volatileExpressionCounter.merge(expr, 1, Integer::sum);
+                    Set<Slot> inputSlots = expr.getInputSlots();
+                    volatileExpressionSlots
+                            .computeIfAbsent(expr, ignored -> Sets.newLinkedHashSet())
+                            .addAll(inputSlots.isEmpty() ? target.getInputSlots() : inputSlots);

Review Comment:
   The local variable `inputSlots` declared on line 283
   (`Set<Slot> inputSlots = expr.getInputSlots();`) is the same variable that
   is read on line 286, so the behavior is correct. The name is ambiguous,
   though: consider renaming it (e.g., `ownSlots` or `ownInputSlots`) to
   distinguish the volatile expression's own input slots from the accumulated
   context slots stored in the map. This would also clarify the intent at line
   286, where `inputSlots.isEmpty()` checks the volatile expression's own slots,
   while `target.getInputSlots()` provides the fallback context from the
   containing expression.
   
   Suggested rename:
   ```java
   Set<Slot> ownSlots = expr.getInputSlots();
   volatileExpressionSlots
           .computeIfAbsent(expr, ignored -> Sets.newLinkedHashSet())
           .addAll(ownSlots.isEmpty() ? target.getInputSlots() : ownSlots);
   ```
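
The fallback pattern behind the rename can be sketched in isolation. This is a simplified illustration, not the Doris implementation: plain strings stand in for `Expression` and `Slot`, and `record` and `f(b)` are hypothetical names used only for this example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class VolatileSlotSketch {
    // Accumulates slots per volatile expression. When the expression has no
    // slots of its own, the slots of the enclosing target are used instead.
    static void record(Map<String, Set<String>> slotsByExpr, String expr,
            Set<String> ownSlots, Set<String> targetSlots) {
        slotsByExpr.computeIfAbsent(expr, ignored -> new LinkedHashSet<>())
                .addAll(ownSlots.isEmpty() ? targetSlots : ownSlots);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> slotsByExpr = new LinkedHashMap<>();
        // "rand()" reads no slot itself, so it inherits slot "a" from a target such as "a + rand() > 1".
        record(slotsByExpr, "rand()", Set.of(), Set.of("a"));
        // "f(b)" is a hypothetical volatile call that reads slot "b" directly.
        record(slotsByExpr, "f(b)", Set.of("b"), Set.of("a", "b"));
        System.out.println(slotsByExpr); // {rand()=[a], f(b)=[b]}
    }
}
```

Naming the parameter `ownSlots` makes the ternary read as "own slots if any, otherwise the target's slots", which is the distinction the rename is meant to surface.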



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

