szehon-ho commented on code in PR #54459:
URL: https://github.com/apache/spark/pull/54459#discussion_r2893266385


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/PushDownUtils.scala:
##########
@@ -99,15 +141,40 @@ object PushDownUtils {
         // Data source filters that need to be evaluated again after scanning. which means
         // the data source cannot guarantee the rows returned can pass these filters.
         // As a result we must return it so Spark can plan an extra filter operator.
-        val postScanFilters = r.pushPredicates(translatedFilters.toArray).map { predicate =>
-          DataSourceV2Strategy.rebuildExpressionFromFilter(predicate, translatedFilterToExpr)
+        val firstPassPostScanFilters = r.pushPredicates(translatedFilters.toArray).map {
+          predicate =>
+            DataSourceV2Strategy.rebuildExpressionFromFilter(predicate, translatedFilterToExpr)
+        }
+
+        // When partition schema is available (enhanced partition filter enabled as per
+        // SPARK-55596), calculate PartitionPredicates
+        val secondPassFilterExprs = untranslatableExprs.toSeq ++ firstPassPostScanFilters
+        val (partitionPredicates, untranslatableDataFilters) = partitionSchema match {
+          case Some(structType) =>
+            val (partitionExprs, dataExprs) =
+              DataSourceUtils.getPartitionFiltersAndDataFilters(structType, secondPassFilterExprs)
+            val preds = partitionExprs.map(expr =>
+              new PartitionPredicateImpl(expr, toAttributes(structType)))
+            (preds, dataExprs)
+          case None =>
+            (Seq.empty[PartitionPredicate], secondPassFilterExprs)
+        }
+
+        val finalPostScanFilters = if (r.supportsEnhancedPartitionFiltering()) {
+          val secondPassPostScanFilters = r.pushPredicates(partitionPredicates.toArray).map {
+            predicate =>
+              DataSourceV2Strategy.rebuildExpressionFromFilter(predicate, translatedFilterToExpr)
+          }
+          firstPassPostScanFilters ++ secondPassPostScanFilters ++ untranslatableDataFilters

Review Comment:
   Good catch. I updated this whole method. I also realized that the original
   implementation put the translatable filters first as an optimization for the
   postScanFilters (see the comment below), so I updated the logic to do the
   same.
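
   To illustrate the ordering described above, here is a minimal, Spark-free
   sketch. It is only a toy model under stated assumptions: `ToyFilter`,
   `splitPartitionAndData`, and `translatableFirst` are hypothetical names
   that do not exist in Spark, and the partition/data split is a simplified
   approximation rather than the actual behavior of
   `DataSourceUtils.getPartitionFiltersAndDataFilters`.

```scala
// Toy stand-in for a Catalyst expression: a label, the columns it reads,
// and whether it could be translated to a data source predicate.
// All names here are hypothetical and not part of Spark.
final case class ToyFilter(label: String, refs: Set[String], translatable: Boolean)

object PostScanOrderingSketch {
  // A filter counts as a partition filter when it reads at least one column
  // and every column it reads is a partition column (simplifying assumption).
  def splitPartitionAndData(
      partitionCols: Set[String],
      filters: Seq[ToyFilter]): (Seq[ToyFilter], Seq[ToyFilter]) =
    filters.partition(f => f.refs.nonEmpty && f.refs.subsetOf(partitionCols))

  // Put translatable filters ahead of untranslatable ones.
  // Seq.partition preserves the relative order within each group.
  def translatableFirst(filters: Seq[ToyFilter]): Seq[ToyFilter] = {
    val (translatable, untranslatable) = filters.partition(_.translatable)
    translatable ++ untranslatable
  }
}
```

   In this sketch, the post-scan list would be built by calling
   `translatableFirst` on whatever filters remain after both push-down
   passes, so that the translatable ones come first in the extra filter
   operator.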



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

