acking-you commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2798871646
> > [@acking-you](https://github.com/acking-you) the code needs to be
extended to support nulls (you can take a look at the true_count implementation
in arrow-rs to do this e
acking-you commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2798742956
> @acking-you the code needs to be extended to support nulls (you can take a
look at the true_count implementation in arrow-rs to do this efficiently).
I have an idea f
kosiew commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2798494451
Hi @Dandandan,
I am getting test failures with
```rust
#[test]
fn test_all_one() -> Result<()> {
// Helper function to run tests and repo
Dandandan commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2796844672
Btw, as a simple proof of concept, I tested this yesterday and it reduced the
execution time of the short-circuiting all-false / all-true cases by 25%
compared to `true_count` / `false_count`:
alamb commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2792363126
`ShortCircuitStrategy` is a pretty neat idea
In my opinion, as long as the code is easy to understand, makes realistic
benchmarks faster, and doesn't regress existing perfo
Dandandan commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2792442938
> I don't know if you think it's a good idea? @alamb @Dandandan
I think it is a pretty good idea given that evaluation is so important.
acking-you commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2788923437
I have an idea that might improve the effectiveness of short-circuit
optimization, and it seems necessary to use `false_count` for evaluation
counting.
The current iss
Dandandan commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2786711757
Would be good to compare it with a boolean version of this as well, like
this, to see if it vectorizes better:
```
pub fn all_zero(&self) -> bool {
// plat
alamb commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2786424000
> sum += chunk[i].count_ones() as usize;
Maybe simply unrolling the loop manually to check 1024 bits at a time would
let LLVM generate the best code.
Something
Dandandan commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2786341929
Interesting!
I think we probably can take some inspiration from arrow-rs aggregate code,
e.g. doing something like (?):
```rust
/// Counts the number of on
acking-you commented on issue #15631:
URL: https://github.com/apache/datafusion/issues/15631#issuecomment-2786166445
This might require manual SIMD for optimization, but that would increase the
porting difficulty ([as DuckDB
says](https://duckdb.org/faq.html#does-duckdb-use-simd)). However,
alamb opened a new issue, #15631:
URL: https://github.com/apache/datafusion/issues/15631
### Is your feature request related to a problem or challenge?
@acking-you's wonderful PR https://github.com/apache/datafusion/pull/15462
adds short circuiting to boolean operation evaluation whi