cloud-fan commented on PR #54459:
URL: https://github.com/apache/spark/pull/54459#issuecomment-3998920073

   **Design: second-pass rejection should avoid `rebuildExpressionFromFilter`**
   
   The second pass pushes `PartitionPredicateImpl` objects (which wrap Catalyst 
expressions) and then routes any rejected ones through 
`DataSourceV2Strategy.rebuildExpressionFromFilter` to convert back to Catalyst 
expressions:
   
   ```scala
   val secondPassPostScanFilters = r.pushPredicates(partitionPredicates.toArray).map {
     predicate =>
       DataSourceV2Strategy.rebuildExpressionFromFilter(predicate, translatedFilterToExpr)
   }
   ```
   
   This requires adding a `PartitionPredicateImpl` case to 
`rebuildExpressionFromFilter`, which couples the generic V2 strategy layer to 
an internal partition predicate implementation. The `translatedFilterToExpr` 
map (built during the first pass) is also passed but never actually used for 
second-pass predicates -- it's only there because `rebuildExpressionFromFilter` 
requires it as a parameter.
   
   Since `PushDownUtils` already creates these `PartitionPredicateImpl` objects 
and knows they wrap Catalyst expressions, we can handle the round-trip locally 
without touching `DataSourceV2Strategy`:
   
   ```scala
   val finalPostScanFilters = if (r.supportsEnhancedPartitionFiltering()) {
     val rejectedPartitionPreds = r.pushPredicates(partitionPredicates.toArray)
     val secondPassPostScanFilters = rejectedPartitionPreds.map {
       _.asInstanceOf[PartitionPredicateImpl].expression
     }
     firstPassPostScanFilters ++ secondPassPostScanFilters ++ untranslatableDataFilters
   } else {
     firstPassPostScanFilters ++ untranslatableExprs
   }
   ```
   
   This removes the need to modify `rebuildExpressionFromFilter` entirely, 
keeps the `PartitionPredicateImpl` coupling within `PushDownUtils` (which 
already imports and creates it), and is more straightforward -- we created the 
wrappers, we unwrap them.
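
To illustrate the wrap/unwrap round trip in isolation, here is a minimal, self-contained sketch. Everything in it is a hypothetical stand-in rather than Spark's real API: `Expression`, `EqualTo`, `Predicate`, `PartitionPredicateImpl`, `ToyScanBuilder`, and `SecondPassSketch` are toy types that only mimic the shapes discussed above. It also swaps the `asInstanceOf` cast for a pattern match, so that an unexpected predicate type fails with a descriptive error; that is one possible choice, not a requirement of the suggestion.

```scala
// Hypothetical stand-ins for Catalyst expressions and V2 predicates.
// These are NOT Spark's classes; they only mirror the shapes involved.
sealed trait Expression
final case class EqualTo(column: String, value: Any) extends Expression

trait Predicate
// The wrapper simply carries the Catalyst-side expression it was built from.
final case class PartitionPredicateImpl(expression: Expression) extends Predicate

// Toy scan builder: accepts predicates on the "dt" column and returns
// (i.e. rejects) everything else, like a pushPredicates-style contract.
final class ToyScanBuilder {
  def pushPredicates(predicates: Array[Predicate]): Array[Predicate] =
    predicates.filter {
      case PartitionPredicateImpl(EqualTo("dt", _)) => false // accepted
      case _ => true                                          // rejected
    }
}

object SecondPassSketch {
  // Unwrap rejected predicates back into the expressions they wrap.
  // No lookup map is needed, because the wrapper already holds the expression.
  def unwrapRejected(rejected: Array[Predicate]): Seq[Expression] =
    rejected.toSeq.map {
      case PartitionPredicateImpl(expr) => expr
      case other =>
        throw new IllegalStateException(
          s"Unexpected predicate returned from second pass: $other")
    }

  // Wrap, push, and unwrap in one place, so the coupling stays local.
  def secondPass(builder: ToyScanBuilder, partitionExprs: Seq[Expression]): Seq[Expression] = {
    val wrapped: Array[Predicate] =
      partitionExprs.map(e => (PartitionPredicateImpl(e): Predicate)).toArray
    unwrapRejected(builder.pushPredicates(wrapped))
  }
}
```

In this sketch, calling `SecondPassSketch.secondPass` with one predicate on `dt` and one on `region` returns only the `region` expression as a post-scan filter, since the toy builder accepts only `dt`.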
   
   ---
   _This comment was generated with [GitHub MCP](http://go/mcps)._


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


