acking-you commented on code in PR #15462:
URL: https://github.com/apache/datafusion/pull/15462#discussion_r2023116930


##########
datafusion/physical-expr/src/expressions/binary.rs:
##########
@@ -805,6 +811,47 @@ impl BinaryExpr {
     }
 }
 
+/// Check if it meets the short-circuit condition
+/// 1. For the `AND` operator, if the `lhs` result all are `false`
+/// 2. For the `OR` operator, if the `lhs` result all are `true`
+/// 3. Otherwise, it does not meet the short-circuit condition
+fn check_short_circuit(arg: &ColumnarValue, op: &Operator) -> bool {
+    let data_type = arg.data_type();
+    match (data_type, op) {
+        (DataType::Boolean, Operator::And) => {
+            match arg {
+                ColumnarValue::Array(array) => {
+                    if let Ok(array) = as_boolean_array(&array) {
+                        return array.false_count() == array.len();

Review Comment:
   > Might be overkill, but one _could_ try a sampling approach: Run the loop 
with the early exit for the first few chunks, and then switch over to the 
unconditional loop.
   
   Thank you for the suggestion. However, if the early-exit check only runs on 
the first few chunks, I don't think this optimization is meaningful: when 
nearly all chunks can be filtered out by the preceding filter, the 
optimization would no longer take effect.
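   
   For reference, a minimal sketch of the sampling idea quoted above, written 
against a plain `&[bool]` rather than Arrow's `BooleanArray`. The function 
name and the `chunk_size` / `sample_chunks` parameters are hypothetical and 
are not part of this PR:

```rust
/// Hypothetical helper: returns true when every value in `values` is false.
/// The first `sample_chunks` chunks are scanned with an early exit; the
/// remainder is counted in a single pass with no early exit.
fn all_false_sampled(values: &[bool], chunk_size: usize, sample_chunks: usize) -> bool {
    assert!(chunk_size > 0, "chunk_size must be non-zero");

    // Split the input into the sampled prefix and the rest.
    let sampled_len = values.len().min(chunk_size.saturating_mul(sample_chunks));
    let (head, tail) = values.split_at(sampled_len);

    // Phase 1: early exit as soon as a sampled chunk contains a `true`.
    for chunk in head.chunks(chunk_size) {
        if chunk.iter().any(|&v| v) {
            return false;
        }
    }

    // Phase 2: count over the remainder without leaving the loop early.
    let true_count = tail.iter().filter(|&&v| v).count();
    true_count == 0
}
```

   With this shape, a `true` that appears after the sampled prefix is only 
noticed once the whole remainder has been counted.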
   
   > If we find that this slows down some other performance we could also add 
some sort of heuristic check to calling false_count / true_count -- like for 
example if the rhs arg is "complex" (not a Column for example)
   
   I tend to agree with @alamb's point: if the overhead of this check turns 
out to be unacceptable, a heuristic that decides when to run it would be the 
better approach.
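   
   A rough sketch of what such a heuristic could look like, assuming "complex" 
simply means "not a plain column reference". `RhsExpr`, `Op`, and 
`should_check_short_circuit` are stand-ins invented for illustration; they 
are not DataFusion types or part of this PR:

```rust
/// Stand-in for the right-hand-side physical expression (hypothetical).
enum RhsExpr {
    /// A plain column reference, cheap to evaluate.
    Column(usize),
    /// Anything else, assumed to be expensive enough to be worth skipping.
    Complex,
}

/// Stand-in for the two logical operators that can short-circuit.
#[derive(Clone, Copy)]
enum Op {
    And,
    Or,
}

/// Hypothetical gate: only pay for the count when the rhs is not a column.
fn should_check_short_circuit(rhs: &RhsExpr) -> bool {
    !matches!(rhs, RhsExpr::Column(_))
}

/// Mirrors the check in the diff, but on a plain slice and behind the gate.
fn check_short_circuit(lhs: &[bool], op: Op, rhs: &RhsExpr) -> bool {
    if !should_check_short_circuit(rhs) {
        return false;
    }
    match op {
        // AND can skip the rhs when every lhs value is false.
        Op::And => lhs.iter().filter(|&&v| !v).count() == lhs.len(),
        // OR can skip the rhs when every lhs value is true.
        Op::Or => lhs.iter().filter(|&&v| v).count() == lhs.len(),
    }
}
```

   The gate keeps the counting cost off the path where the rhs is already 
cheap; whether "not a Column" is the right threshold would need benchmarks.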
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

