yaooqinn opened a new pull request, #54440:
URL: https://github.com/apache/spark/pull/54440
### What changes were proposed in this pull request?
Replace `AlwaysProcess.fn` with pattern-based pruning in two Analyzer rules:
1. **EliminateSubqueryAliases**: Use `_.containsPattern(SUBQUERY_ALIAS)`
   - Skips the entire plan traversal when the plan contains no `SubqueryAlias` nodes
   - This case is common for resolved plans after the initial resolution passes
2. **ResolveInlineTables**: Use `_.containsPattern(INLINE_TABLE_EVAL)`
   - Skips the traversal when the plan contains no `UnresolvedInlineTable` nodes
   - Inline tables are rare; most queries never contain them
Also adds `INLINE_TABLE_EVAL` to `UnresolvedInlineTable.nodePatterns`; the
pattern was previously declared only on `ResolvedInlineTable`. Without this
addition, the pruning condition for `ResolveInlineTables` could never be
satisfied by a plan whose only inline tables are unresolved, so the rule would
skip exactly the nodes it is meant to resolve.
Both rules previously used `AlwaysProcess.fn`, which forces a full tree
traversal on every fixed-point iteration even when no matching nodes exist.
Because `TreePatternBits` are propagated from children to parents, the
`containsPattern` check at the root is O(1) and short-circuits the whole
traversal.
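The pruning idea can be illustrated with a small standalone model. This is a sketch only, not Spark's implementation: `Node`, `Alias`, `InlineTable`, the bit constants, and `transformCounting` are hypothetical names invented for the example, and a single `Long` mask stands in for `TreePatternBits`. It shows both the root-level short-circuit and why a node that does not declare its pattern is never reached by a pruned rule.

```scala
// Toy model of pattern-bit pruning. All names are hypothetical, not Spark classes.
object PatternPruningSketch {
  // One bit per pattern (stand-ins for SUBQUERY_ALIAS and INLINE_TABLE_EVAL).
  val SubqueryAliasBit: Long = 1L << 0
  val InlineTableBit: Long = 1L << 1

  sealed trait Node {
    def children: Seq[Node]
    def ownPatterns: Long // bits declared by this node alone
    def withChildren(newChildren: Seq[Node]): Node
    // Union of this node's bits and its descendants' bits, computed once per node.
    lazy val patternBits: Long = children.foldLeft(ownPatterns)(_ | _.patternBits)
    def containsPattern(bit: Long): Boolean = (patternBits & bit) != 0L
  }

  case class Relation(name: String) extends Node {
    val children: Seq[Node] = Nil
    val ownPatterns: Long = 0L
    def withChildren(newChildren: Seq[Node]): Node = this
  }

  case class Alias(name: String, child: Node) extends Node {
    def children: Seq[Node] = Seq(child)
    val ownPatterns: Long = SubqueryAliasBit
    def withChildren(newChildren: Seq[Node]): Node = copy(child = newChildren.head)
  }

  case class Project(child: Node) extends Node {
    def children: Seq[Node] = Seq(child)
    val ownPatterns: Long = 0L
    def withChildren(newChildren: Seq[Node]): Node = copy(child = newChildren.head)
  }

  // `tagged = false` models an inline-table node that forgot to declare its pattern.
  case class InlineTable(tagged: Boolean) extends Node {
    val children: Seq[Node] = Nil
    val ownPatterns: Long = if (tagged) InlineTableBit else 0L
    def withChildren(newChildren: Seq[Node]): Node = this
  }

  val alwaysProcess: Node => Boolean = _ => true

  val eliminateAliases: PartialFunction[Node, Node] = { case Alias(_, child) => child }
  val resolveInline: PartialFunction[Node, Node] = { case InlineTable(_) => Relation("resolved") }

  // Bottom-up transform that skips any subtree for which `cond` is false.
  // Returns the new tree and the number of nodes actually visited.
  def transformCounting(plan: Node, cond: Node => Boolean)(
      rule: PartialFunction[Node, Node]): (Node, Int) = {
    var visited = 0
    def go(node: Node): Node =
      if (!cond(node)) node
      else {
        visited += 1
        val rebuilt = node.withChildren(node.children.map(go))
        rule.applyOrElse(rebuilt, identity[Node])
      }
    val result = go(plan)
    (result, visited)
  }

  def main(args: Array[String]): Unit = {
    val noAlias = Project(Relation("a"))
    val hasAlias: Node => Boolean = _.containsPattern(SubqueryAliasBit)
    println(transformCounting(noAlias, alwaysProcess)(eliminateAliases))
    println(transformCounting(noAlias, hasAlias)(eliminateAliases))
  }
}
```

In this model, the pruned call on `noAlias` returns at the root without visiting any node, while the `alwaysProcess` call walks both nodes and changes nothing. Running `resolveInline` with the condition `_.containsPattern(InlineTableBit)` over `Project(InlineTable(tagged = false))` leaves the plan untouched, which mirrors why the unresolved node has to declare the pattern before the pruning condition can be used.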
### Why are the changes needed?
Performance optimization: avoids unnecessary full-plan traversals during
analysis when the relevant node types are absent.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Existing tests: `AnalysisSuite`, `EliminateSubqueryAliasesSuite`, and the
inline-table-related tests all pass (491 tests).
### Was this patch authored or co-authored using generative AI tooling?
Yes, GitHub Copilot.