yujun777 commented on code in PR #62742:
URL: https://github.com/apache/doris/pull/62742#discussion_r3356110310


##########
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/AddProjectForVolatileExpression.java:
##########
@@ -269,6 +271,85 @@ public <T extends Expression> Optional<Pair<List<T>, 
LogicalProject<Plan>>> rewr
         return Optional.of(Pair.of(newTargetsBuilder.build(), new 
LogicalProject<>(projects, plan.child(0))));
     }
 
+    private Optional<JoinRewriteResult> 
rewriteJoinExpressions(LogicalJoin<Plan, Plan> join,
+            Collection<Expression> targets) {
+        Map<Expression, Integer> volatileExpressionCounter = 
Maps.newLinkedHashMap();
+        Map<Expression, Set<Slot>> volatileExpressionSlots = 
Maps.newLinkedHashMap();
+        for (Expression target : targets) {
+            target.foreach(e -> {
+                Expression expr = (Expression) e;
+                if (expr.isVolatile()) {
+                    volatileExpressionCounter.merge(expr, 1, Integer::sum);
+                    Set<Slot> inputSlots = expr.getInputSlots();
+                    volatileExpressionSlots
+                            .computeIfAbsent(expr, ignored -> 
Sets.newLinkedHashSet())
+                            .addAll(inputSlots.isEmpty() ? 
target.getInputSlots() : inputSlots);
+                }
+            });
+        }
+
+        ImmutableList.Builder<NamedExpression> leftAliases = 
ImmutableList.builder();
+        ImmutableList.Builder<NamedExpression> rightAliases = 
ImmutableList.builder();
+        Map<Expression, Slot> replaceMap = Maps.newHashMap();
+        Set<Slot> leftOutputSet = join.left().getOutputSet();
+        Set<Slot> rightOutputSet = join.right().getOutputSet();
+        for (Entry<Expression, Integer> entry : 
volatileExpressionCounter.entrySet()) {
+            if (entry.getValue() <= 1) {
+                continue;

Review Comment:
   I think this is the less-bad behavior under Doris's current join/project 
split. A  cannot represent a per-pair project for a volatile expression, so if 
the same volatile node appears multiple times after rewrites like  expansion, 
leaving it in the join means it will be re-evaluated independently for each 
occurrence and can diverge within the same predicate. By materializing it into 
one child project, we at least force repeated references to share a single 
value, which preserves the one-value-per-expression property that the original 
predicate needs. This is not fully equivalent to the join-pair semantics, but 
it is strictly better than keeping the duplicated volatile expression on the 
join and letting it be re-sampled separately.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to