morrySnow commented on code in PR #62742:
URL: https://github.com/apache/doris/pull/62742#discussion_r3354877345
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/PushDownFilterThroughSetOperation.java:
##########
@@ -61,11 +63,47 @@ public Rule build() {
.when(s -> s.arity() > 0
|| (s instanceof LogicalUnion && !((LogicalUnion) s).getConstantExprsList().isEmpty())))
.thenApply(ctx -> {
- LogicalFilter<LogicalSetOperation> filter = ctx.root;
- LogicalSetOperation setOperation = filter.child();
+ LogicalFilter<LogicalSetOperation> origFilter = ctx.root;
+ LogicalSetOperation setOperation = origFilter.child();
+
+ // Pushing a conjunct that contains a volatile expression (rand/uuid/random_bytes/...)
+ // into each branch changes semantics for every set-op except UNION ALL.
+ // - UNION ALL: each branch row = exactly one output row (1:1), so evaluating
+ // rand() once per branch row still matches the per-output-row semantic.
+ // - UNION DISTINCT / INTERSECT / EXCEPT: the set-op semantics depend on the
+ // full branch row sets before dedup/intersect/except. Sampling rows in each
+ // branch independently changes which rows participate (e.g. INTERSECT becomes
+ // "half of A intersect half of B" instead of "half of (A intersect B)").
+ boolean canPushVolatileExpr = setOperation instanceof LogicalUnion
+ && setOperation.getQualifier() == Qualifier.ALL;
+ Set<Expression> pushableConjuncts;
+ Set<Expression> keptAboveConjuncts;
+ if (canPushVolatileExpr) {
+ pushableConjuncts = origFilter.getConjuncts();
+ keptAboveConjuncts = ImmutableSet.of();
+ } else {
+ pushableConjuncts = new LinkedHashSet<>();
+ Set<Expression> kept = new LinkedHashSet<>();
+ for (Expression c : origFilter.getConjuncts()) {
+ if (c.containsVolatileExpression()) {
+ kept.add(c);
+ } else {
+ pushableConjuncts.add(c);
+ }
+ }
+ keptAboveConjuncts = kept;
+ if (pushableConjuncts.isEmpty()) {
+ return null;
+ }
+ }
+ LogicalFilter<LogicalSetOperation> filter = pushableConjuncts == origFilter.getConjuncts()
Review Comment:
The reference equality check `pushableConjuncts == origFilter.getConjuncts()`
decides whether to reuse the original filter or create a new one. It is
correct today only because the UNION ALL branch at line 82 assigns
`pushableConjuncts = origFilter.getConjuncts()`, i.e. the very same reference.
That makes it fragile: if `getConjuncts()` were ever changed to return a
defensive copy or a view, the `==` check would silently start returning false.
Consider replacing it with an explicit boolean flag, which would be more
robust and self-documenting:
```java
boolean allConjunctsPushable = canPushVolatileExpr;
// ...
LogicalFilter<LogicalSetOperation> filter = allConjunctsPushable
? origFilter
: new LogicalFilter<>(ImmutableSet.copyOf(pushableConjuncts),
setOperation);
```
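To make the fragility concrete, here is a minimal, self-contained sketch. The `Filter` class and both `pick...` helpers are hypothetical stand-ins invented for illustration, not the Doris `LogicalFilter` API; the `defensiveCopy` switch simulates a future change to the getter.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class ReferenceEqualitySketch {
    // Hypothetical stand-in for a filter node; NOT the Doris LogicalFilter API.
    static final class Filter {
        private final Set<String> conjuncts;
        private final boolean defensiveCopy;

        Filter(Set<String> conjuncts, boolean defensiveCopy) {
            this.conjuncts = conjuncts;
            this.defensiveCopy = defensiveCopy;
        }

        // Returns the backing set, or a fresh copy when defensiveCopy is set.
        Set<String> getConjuncts() {
            return defensiveCopy ? new LinkedHashSet<>(conjuncts) : conjuncts;
        }
    }

    // Mirrors the check under review: the decision depends on reference identity.
    static String pickByIdentity(Filter filter) {
        Set<String> pushable = filter.getConjuncts();
        return pushable == filter.getConjuncts() ? "reuse" : "rebuild";
    }

    // Mirrors the suggested fix: the decision depends on an explicit flag.
    static String pickByFlag(boolean allConjunctsPushable) {
        return allConjunctsPushable ? "reuse" : "rebuild";
    }

    public static void main(String[] args) {
        Set<String> conjuncts = new LinkedHashSet<>();
        conjuncts.add("a > 1");

        // Same backing reference: identity check and flag agree.
        System.out.println(pickByIdentity(new Filter(conjuncts, false)));
        // Getter now copies: identity check flips, although nothing else changed.
        System.out.println(pickByIdentity(new Filter(conjuncts, true)));
        // The flag is unaffected by how the getter is implemented.
        System.out.println(pickByFlag(true));
    }
}
```

With the flag, the reuse decision is tied to `canPushVolatileExpr` directly instead of to an implementation detail of the getter.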
##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java:
##########
@@ -269,6 +271,85 @@ public <T extends Expression> Optional<Pair<List<T>, LogicalProject<Plan>>> rewr
return Optional.of(Pair.of(newTargetsBuilder.build(), new LogicalProject<>(projects, plan.child(0))));
}
+ private Optional<JoinRewriteResult> rewriteJoinExpressions(LogicalJoin<Plan, Plan> join,
+ Collection<Expression> targets) {
+ Map<Expression, Integer> volatileExpressionCounter = Maps.newLinkedHashMap();
+ Map<Expression, Set<Slot>> volatileExpressionSlots = Maps.newLinkedHashMap();
+ for (Expression target : targets) {
+ target.foreach(e -> {
+ Expression expr = (Expression) e;
+ if (expr.isVolatile()) {
+ volatileExpressionCounter.merge(expr, 1, Integer::sum);
+ Set<Slot> inputSlots = expr.getInputSlots();
+ volatileExpressionSlots
+ .computeIfAbsent(expr, ignored -> Sets.newLinkedHashSet())
+ .addAll(inputSlots.isEmpty() ? target.getInputSlots() : inputSlots);
Review Comment:
The local variable `inputSlots` on line 283 (`Set<Slot> inputSlots =
expr.getInputSlots();`) holds the volatile expression's own input slots, yet
its name is easy to confuse with the accumulated context slots stored in
`volatileExpressionSlots`. The behavior is correct, but consider renaming the
variable (e.g., `ownInputSlots` or `volatileOwnSlots`) to make that
distinction explicit. This would also clarify the intent at line 286, where
`inputSlots.isEmpty()` checks the volatile expression's own slots, while
`target.getInputSlots()` provides the fallback context from the containing
expression.
Suggested rename:
```java
Set<Slot> ownSlots = expr.getInputSlots();
volatileExpressionSlots
.computeIfAbsent(expr, ignored -> Sets.newLinkedHashSet())
.addAll(ownSlots.isEmpty() ? target.getInputSlots() : ownSlots);
```
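The fallback behavior that the rename is meant to highlight can be sketched in isolation. This is an illustration only: `recordSlots`, the string-typed expressions, and the slot names are hypothetical and stand in for the Doris `Expression` and `Slot` types.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class OwnSlotsSketch {
    // Hypothetical helper, not part of Doris: records the slots associated with
    // one volatile expression. If the expression reads no slots itself (e.g. a
    // zero-argument rand()), fall back to the slots of the containing target.
    static void recordSlots(Map<String, Set<String>> slotsByExpr, String expr,
            Set<String> ownSlots, Set<String> targetSlots) {
        slotsByExpr.computeIfAbsent(expr, ignored -> new LinkedHashSet<>())
                .addAll(ownSlots.isEmpty() ? targetSlots : ownSlots);
    }

    public static void main(String[] args) {
        Map<String, Set<String>> slotsByExpr = new LinkedHashMap<>();
        Set<String> targetSlots = new LinkedHashSet<>();
        targetSlots.add("t1.a");
        targetSlots.add("t2.b");

        // No own slots: the target's slots are used as context.
        recordSlots(slotsByExpr, "rand()", Collections.emptySet(), targetSlots);
        // Own slots present: only those are recorded.
        recordSlots(slotsByExpr, "f(t1.c)", Collections.singleton("t1.c"), targetSlots);

        System.out.println(slotsByExpr);
    }
}
```

Naming the parameter `ownSlots` makes the ternary read as "own slots if any, otherwise the target's", which is the distinction the comment asks for.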
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]