Re: [PR] [SPARK-55695][SQL] Avoid double planning in row-level operations [spark]

via GitHub Wed, 04 Mar 2026 15:54:33 -0800


gengliangwang commented on code in PR #54492:
URL: https://github.com/apache/spark/pull/54492#discussion_r2886755590



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##########
@@ -370,7 +370,16 @@ abstract class Optimizer(catalogManager: CatalogManager)
       s.withNewPlan(removeTopLevelSort(newPlan))
     }
 
-    def apply(plan: LogicalPlan): LogicalPlan = plan.transformAllExpressionsWithPruning(
+    // optimizes subquery expressions, ignoring row-level operation conditions
+    def apply(plan: LogicalPlan): LogicalPlan = {
+      plan.transformWithPruning(_.containsPattern(PLAN_EXPRESSION), ruleId) {
+        case wd: WriteDelta => wd
+        case rd: ReplaceData => rd

Review Comment:
   Minor: consider matching on `RowLevelWrite` instead of listing `WriteDelta`
   and `ReplaceData` individually. This would be more future-proof if new
   `RowLevelWrite` subtypes are added, and `RowLevelWrite` is already available
   via the `plans.logical._` wildcard import.
   
   ```scala
   case rw: RowLevelWrite => rw // named binding: return the matched node, not the outer `plan`
   ```
   
   That said, the explicit listing is fine too: it makes the scope of the
   short-circuit clear.
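
To illustrate the trade-off, here is a minimal, self-contained sketch. It uses toy stand-in types rather than the real Catalyst classes: `Plan`, `Filter`, `FutureWrite`, and `rewrite` are hypothetical names made up for this example, and the toy `RowLevelWrite`, `WriteDelta`, and `ReplaceData` only mimic the shape of the real hierarchy. It shows that a single match on the parent trait also covers subtypes added later, and that a named binding returns the matched node itself.

```scala
// Toy stand-ins for logical plan nodes. These are NOT the real Catalyst classes.
sealed trait Plan
trait RowLevelWrite extends Plan // stand-in for the shared parent trait
case class WriteDelta(condition: String) extends RowLevelWrite
case class ReplaceData(condition: String) extends RowLevelWrite
case class FutureWrite(condition: String) extends RowLevelWrite // hypothetical later subtype
case class Filter(condition: String) extends Plan

// Hypothetical rewrite step: row-level writes are skipped, other nodes are rewritten.
def rewrite(node: Plan): Plan = node match {
  case rw: RowLevelWrite => rw // short-circuit: return the matched node unchanged
  case Filter(cond)      => Filter(s"optimized($cond)")
}
```

With the explicit `WriteDelta` / `ReplaceData` listing, a node like `FutureWrite` would fall through to the rewriting cases unless the rule were updated; with the trait match it is skipped automatically. Which behavior is preferable depends on whether new subtypes should opt in or opt out of the short-circuit.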



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

