gengliangwang commented on code in PR #54492:
URL: https://github.com/apache/spark/pull/54492#discussion_r2886755590
##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala:
##########
@@ -370,7 +370,16 @@ abstract class Optimizer(catalogManager: CatalogManager)
s.withNewPlan(removeTopLevelSort(newPlan))
}
- def apply(plan: LogicalPlan): LogicalPlan =
plan.transformAllExpressionsWithPruning(
+ // optimizes subquery expressions, ignoring row-level operation conditions
+ def apply(plan: LogicalPlan): LogicalPlan = {
+ plan.transformWithPruning(_.containsPattern(PLAN_EXPRESSION), ruleId) {
+ case wd: WriteDelta => wd
+ case rd: ReplaceData => rd
Review Comment:
Minor: consider matching on `RowLevelWrite` instead of listing `WriteDelta`
and `ReplaceData` individually. This would be more future-proof if new
`RowLevelWrite` subtypes are added, and `RowLevelWrite` is already available
via the `plans.logical._` wildcard import.
```scala
case rlw: RowLevelWrite => rlw // return the matched node, not the outer `plan`
```
That said, the explicit listing is fine too, since it makes the scope of the
short-circuit clear.
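
To illustrate the trade-off, here is a minimal, self-contained sketch. It uses toy stand-ins rather than the real Catalyst classes: `Plan`, `Filter`, `FutureWrite`, `optimize`, and `ShortCircuitSketch` are all hypothetical names invented for this example, and the toy `WriteDelta`/`ReplaceData` only mimic the shape of the real nodes. It shows that a match on the parent trait also covers a subtype added later, while the explicit listing lets that subtype fall through to the rewrite.

```scala
// Toy model only: these are NOT the real Spark/Catalyst classes.
sealed trait Plan { def cond: String }
sealed trait RowLevelWrite extends Plan
case class WriteDelta(cond: String) extends RowLevelWrite
case class ReplaceData(cond: String) extends RowLevelWrite
// Hypothetical subtype added after the rule was written.
case class FutureWrite(cond: String) extends RowLevelWrite
case class Filter(cond: String) extends Plan

object ShortCircuitSketch {
  // Stand-in for the subquery optimization applied to a node's condition.
  private def optimize(p: Plan): Plan = p match {
    case WriteDelta(c)  => WriteDelta(s"optimized($c)")
    case ReplaceData(c) => ReplaceData(s"optimized($c)")
    case FutureWrite(c) => FutureWrite(s"optimized($c)")
    case Filter(c)      => Filter(s"optimized($c)")
  }

  // Explicit listing: only the two named subtypes are skipped.
  def explicit(p: Plan): Plan = p match {
    case wd: WriteDelta  => wd
    case rd: ReplaceData => rd
    case other           => optimize(other)
  }

  // Trait match: every RowLevelWrite subtype is skipped, including new ones.
  def byTrait(p: Plan): Plan = p match {
    case rlw: RowLevelWrite => rlw
    case other              => optimize(other)
  }
}
```

In this toy model, `explicit(FutureWrite("c"))` rewrites the condition while `byTrait(FutureWrite("c"))` leaves it untouched. Whether a future `RowLevelWrite` subtype should be skipped by default is a judgment call for the PR author.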