cloud-fan commented on PR #54459:
URL: https://github.com/apache/spark/pull/54459#issuecomment-3998920073
**Design: second-pass rejection should avoid `rebuildExpressionFromFilter`**
The second pass pushes `PartitionPredicateImpl` objects (which wrap Catalyst
expressions) and then routes any rejected ones through
`DataSourceV2Strategy.rebuildExpressionFromFilter` to convert back to Catalyst
expressions:
```scala
val secondPassPostScanFilters =
  r.pushPredicates(partitionPredicates.toArray).map { predicate =>
    DataSourceV2Strategy.rebuildExpressionFromFilter(predicate, translatedFilterToExpr)
  }
```
This requires adding a `PartitionPredicateImpl` case to
`rebuildExpressionFromFilter`, which couples the generic V2 strategy layer to
an internal partition predicate implementation. The `translatedFilterToExpr`
map (built during the first pass) is also passed but never actually used for
second-pass predicates -- it's only there because `rebuildExpressionFromFilter`
requires it as a parameter.
Since `PushDownUtils` already creates these `PartitionPredicateImpl` objects
and knows they wrap Catalyst expressions, we can handle the round-trip locally
without touching `DataSourceV2Strategy`:
```scala
val finalPostScanFilters = if (r.supportsEnhancedPartitionFiltering()) {
  val rejectedPartitionPreds = r.pushPredicates(partitionPredicates.toArray)
  val secondPassPostScanFilters = rejectedPartitionPreds.map {
    _.asInstanceOf[PartitionPredicateImpl].expression
  }
  firstPassPostScanFilters ++ secondPassPostScanFilters ++
    untranslatableDataFilters
} else {
  firstPassPostScanFilters ++ untranslatableExprs
}
```
This removes the need to modify `rebuildExpressionFromFilter` entirely,
keeps the `PartitionPredicateImpl` coupling within `PushDownUtils` (which
already imports and creates it), and is more straightforward -- we created the
wrappers, we unwrap them.
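
For illustration only, the wrap-then-unwrap round-trip can be sketched in isolation. The types below are simplified, hypothetical stand-ins written for this example (they are not Spark's real `Expression`, `Predicate`, or scan-builder interfaces), and the pattern match with an explicit failure branch is one possible defensive variant of the `asInstanceOf` cast above, not a requirement of the proposal:

```scala
// Hypothetical stand-ins for the Spark types discussed above (NOT the real classes).
sealed trait Expression
final case class Attr(name: String) extends Expression
final case class EqualTo(left: Expression, value: Any) extends Expression

trait Predicate
// Stand-in for PartitionPredicateImpl: a wrapper that carries a Catalyst-like expression.
final case class PartitionPredicateImpl(expression: Expression) extends Predicate

// Stand-in for a scan builder: pushPredicates returns the predicates it rejected.
trait ScanBuilder {
  def pushPredicates(predicates: Array[Predicate]): Array[Predicate]
}

object RoundTripSketch {
  // Wrap the partition expressions, push them, and unwrap whatever comes back.
  def secondPass(builder: ScanBuilder, partitionExprs: Seq[Expression]): Seq[Expression] = {
    val wrapped: Array[Predicate] =
      partitionExprs.map(e => (PartitionPredicateImpl(e): Predicate)).toArray
    builder.pushPredicates(wrapped).toSeq.map {
      // We created these wrappers, so unwrapping needs no lookup map.
      case p: PartitionPredicateImpl => p.expression
      case other =>
        throw new IllegalStateException(s"Unexpected predicate from second pass: $other")
    }
  }
}

object RoundTripDemo {
  // Hypothetical builder that rejects only predicates on the "region" column.
  val rejectRegion: ScanBuilder = new ScanBuilder {
    def pushPredicates(predicates: Array[Predicate]): Array[Predicate] =
      predicates.filter {
        case PartitionPredicateImpl(EqualTo(Attr("region"), _)) => true
        case _ => false
      }
  }

  val onDate: Expression = EqualTo(Attr("dt"), "2024-01-01")
  val onRegion: Expression = EqualTo(Attr("region"), "eu")

  def main(args: Array[String]): Unit =
    println(RoundTripSketch.secondPass(rejectRegion, Seq(onDate, onRegion)))
}
```

In this sketch the rejected predicates come back as the original expressions without any equivalent of `translatedFilterToExpr`, which is the point of keeping the unwrap next to the code that did the wrapping.
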
---
_This comment was generated with [GitHub MCP](http://go/mcps)._