acking-you commented on code in PR #15462:
URL: https://github.com/apache/datafusion/pull/15462#discussion_r2023116930
##########
datafusion/physical-expr/src/expressions/binary.rs:
##########
@@ -805,6 +811,47 @@ impl BinaryExpr {
     }
 }
 
+/// Check if it meets the short-circuit condition
+/// 1. For the `AND` operator, if the `lhs` result all are `false`
+/// 2. For the `OR` operator, if the `lhs` result all are `true`
+/// 3. Otherwise, it does not meet the short-circuit condition
+fn check_short_circuit(arg: &ColumnarValue, op: &Operator) -> bool {
+    let data_type = arg.data_type();
+    match (data_type, op) {
+        (DataType::Boolean, Operator::And) => {
+            match arg {
+                ColumnarValue::Array(array) => {
+                    if let Ok(array) = as_boolean_array(&array) {
+                        return array.false_count() == array.len();

Review Comment:
   > Might be overkill, but one _could_ try a sampling approach: Run the loop with the early exit for the first few chunks, and then switch over to the unconditional loop.

   Thank you for the suggestion, but if the conditional check is applied only to the first few blocks, I feel this optimization might not be meaningful. When nearly all blocks can be filtered out by the preceding filter, it would no longer be effective.

   > If we find that this slows down some other performance we could also add some sort of heuristic check to calling false_count / true_count -- like for example if the rhs arg is "complex" (not a Column for example)

   I tend to agree with @alamb's point: if the overhead of the check turns out to be unacceptable, adopting a heuristic approach would be better.
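For reference, the two ideas discussed here could be sketched roughly as follows. This is only an illustration on plain `&[bool]` slices, not the code in this PR and not the Arrow API: `all_false_sampled`, `RhsKind`, and `worth_checking` are hypothetical names, and the chunk size and number of sampled chunks are arbitrary parameters.

```rust
/// Hypothetical stand-in for "is the rhs a plain column or something costlier".
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum RhsKind {
    Column,
    Complex,
}

/// Heuristic sketch: only pay for the all-false / all-true check when
/// skipping the rhs evaluation is likely to save real work.
pub fn worth_checking(rhs: RhsKind) -> bool {
    matches!(rhs, RhsKind::Complex)
}

/// Sampling sketch: returns true if every value in `values` is `false`.
/// The first `sampled_chunks` chunks are scanned with an early exit;
/// the remaining chunks are counted in an unconditional loop.
pub fn all_false_sampled(values: &[bool], chunk_size: usize, sampled_chunks: usize) -> bool {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut chunks = values.chunks(chunk_size);

    // Phase 1: early exit as soon as a sampled chunk contains a `true`.
    for chunk in chunks.by_ref().take(sampled_chunks) {
        if chunk.iter().any(|&v| v) {
            return false;
        }
    }

    // Phase 2: no branching on intermediate results, just count `true`s.
    let true_count: usize = chunks
        .map(|chunk| chunk.iter().filter(|&&v| v).count())
        .sum();
    true_count == 0
}
```

In this sketch the early exit only helps when a `true` shows up within the sampled prefix; once the loop reaches the second phase it always scans to the end of the input.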