adriangb commented on PR #16732:
URL: https://github.com/apache/datafusion/pull/16732#issuecomment-3054703987

   Note that I added a HashJoinExec implementation to motivate this PR but 
removed it in 
[5940cca](https://github.com/apache/datafusion/pull/16732/commits/5940cca7c8ca9620781664425fb45d66aeedd726)
 because it lacks the nuance needed for a complete, correct implementation 
(it doesn't take the join type into account, etc.).
   
   I asked Claude to analyze the logical filter pushdown on joins and it 
reported the following:
   
   ```
     Join Types and Filter Push Down Rules
   
     For WHERE clause filters (lr_is_preserved):
     - Inner joins: Filters can be pushed to both sides
     - Left joins: Filters can only be pushed to the left side
     - Right joins: Filters can only be pushed to the right side
     - Full joins: Filters cannot be pushed to either side
     - Semi/Anti joins: Filters can be pushed to the preserved side only
   
     For ON clause filters (on_lr_is_preserved):
     - Inner joins: Filters can be pushed to both sides
     - Left joins: Filters can only be pushed to the right side
     - Right joins: Filters can only be pushed to the left side
     - Full joins: Filters cannot be pushed to either side
     - Semi/Anti joins: Different rules apply based on join variant
   
     Filter Restrictions
   
     The can_evaluate_as_join_condition function at line 255 shows that filters 
can only be converted to join conditions if they are built solely from the 
allowed expressions and contain none of the disallowed ones:
   
     Allowed expressions:
     - Column references
     - Literals
     - Placeholders
     - Scalar variables
     - Binary expressions
     - LIKE/SimilarTo predicates
     - NOT expressions
     - IS NULL/IS NOT NULL
     - CASE expressions
     - Cast expressions
     - Try cast expressions
     - Scalar functions
   
     Disallowed expressions:
     - Subqueries (EXISTS, IN subquery, scalar subquery)
     - Outer reference columns
     - UNNEST expressions
   
     Additionally, for non-inner joins, inferred predicates must strictly 
filter out NULLs to be pushed down to avoid incorrect results.
     ```
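
To make the join-type rules above easier to check at a glance, here is a minimal sketch that encodes them as two lookup functions returning `(left, right)` flags. Everything in it is hypothetical and for illustration only: `JoinKind`, `where_pushdown_sides`, and `on_pushdown_sides` are made-up names, not DataFusion's `JoinType`, `lr_is_preserved`, or `on_lr_is_preserved`. The WHERE-clause semi/anti entries assume that the "preserved side" of a left semi/anti join is the left input (and the right input for the right variants). The ON-clause semi/anti entries return `None` because the report does not spell them out.

```rust
/// Hypothetical stand-in for a join type; not DataFusion's `JoinType`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum JoinKind {
    Inner,
    Left,
    Right,
    Full,
    LeftSemi,
    RightSemi,
    LeftAnti,
    RightAnti,
}

/// Sides `(left, right)` that a WHERE-clause filter may be pushed into,
/// following the first list in the report.
pub fn where_pushdown_sides(kind: JoinKind) -> (bool, bool) {
    match kind {
        JoinKind::Inner => (true, true),
        JoinKind::Left => (true, false),
        JoinKind::Right => (false, true),
        JoinKind::Full => (false, false),
        // Assumption: the preserved side of a left semi/anti join is the left input.
        JoinKind::LeftSemi | JoinKind::LeftAnti => (true, false),
        // Assumption: the preserved side of a right semi/anti join is the right input.
        JoinKind::RightSemi | JoinKind::RightAnti => (false, true),
    }
}

/// Sides `(left, right)` that an ON-clause filter may be pushed into,
/// following the second list in the report. Returns `None` for semi/anti
/// joins, whose rules the report leaves unspecified.
pub fn on_pushdown_sides(kind: JoinKind) -> Option<(bool, bool)> {
    match kind {
        JoinKind::Inner => Some((true, true)),
        // Note the mirror image of the WHERE-clause rule for outer joins.
        JoinKind::Left => Some((false, true)),
        JoinKind::Right => Some((true, false)),
        JoinKind::Full => Some((false, false)),
        JoinKind::LeftSemi
        | JoinKind::RightSemi
        | JoinKind::LeftAnti
        | JoinKind::RightAnti => None,
    }
}
```

For example, `where_pushdown_sides(JoinKind::Left)` yields `(true, false)` while `on_pushdown_sides(JoinKind::Left)` yields `Some((false, true))`. This mirrored behaviour for outer joins is exactly the join-type nuance that a physical HashJoinExec implementation would also need to respect.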


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

