Aha. I was looking in the wrong place. Buffer does have `bitand`, `bitor`
and `not` methods on it that seem to wrap the underlying `buffer_bin_and`
and `buffer_bin_or`, etc.
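
For reference, here is a minimal sketch of why a bitwise AND is enough for the `null_to_false` idea. It models a nullable boolean column as two bit-packed byte vectors (values and validity, least-significant bit first). `NullableBools`, `null_to_false` and `get_bit` are hypothetical names for illustration, and plain `Vec<u8>` stands in for the crate's `Buffer`, so this is not the arrow crate's API:

```rust
/// Hypothetical stand-in for a nullable boolean array: one bit per slot,
/// least-significant bit first. `validity == None` means "no nulls".
pub struct NullableBools {
    pub values: Vec<u8>,
    pub validity: Option<Vec<u8>>,
    pub len: usize,
}

/// Read bit `i` from a bit-packed buffer.
pub fn get_bit(buf: &[u8], i: usize) -> bool {
    buf[i / 8] & (1 << (i % 8)) != 0
}

/// Replace every null slot with `false`.
/// A slot stays true only if its value bit AND its validity bit are both set,
/// so the whole operation is a byte-wise AND of the two buffers.
pub fn null_to_false(input: &NullableBools) -> NullableBools {
    let values: Vec<u8> = match &input.validity {
        Some(validity) => input
            .values
            .iter()
            .zip(validity)
            .map(|(v, ok)| v & ok)
            .collect(),
        // No validity buffer: nothing is null, so the values are unchanged.
        None => input.values.clone(),
    };
    // The result has no nulls, so it carries no validity buffer.
    NullableBools { values, validity: None, len: input.len }
}
```

With values `0b0000_1011` and validity `0b0000_0110`, only slot 1 is both true and valid, so the result holds `0b0000_0010` and no nulls.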

I'm still curious whether it would make sense to (a) offer some more
variants of those (`A && !B`, for instance) to avoid materializing
temporary buffers, and/or (b) create a variant of filter that treats `null`
as `false`. If either of those makes sense I'm happy to take a stab at it
sometime, if there are any thoughts on the direction to take.
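
To make the two ideas concrete, here is a rough sketch over plain bit-packed byte slices (least-significant bit first). `and_not`, `bit_at` and `filter_null_as_false` are hypothetical names, not existing arrow kernels, and the slices stand in for the crate's buffers:

```rust
/// Read bit `i` from a bit-packed buffer.
pub fn bit_at(buf: &[u8], i: usize) -> bool {
    buf[i / 8] & (1 << (i % 8)) != 0
}

/// Fused `A && !B`: one pass and one output allocation,
/// instead of materializing `!B` as a temporary buffer first.
pub fn and_not(a: &[u8], b: &[u8]) -> Vec<u8> {
    assert_eq!(a.len(), b.len(), "buffers must have the same length");
    a.iter().zip(b).map(|(x, y)| x & !y).collect()
}

/// Filter variant that treats a null mask slot as `false`:
/// `data[i]` is kept only if the mask's value bit is set and the slot is valid.
/// `mask_validity == None` means the mask has no nulls.
pub fn filter_null_as_false<T: Clone>(
    data: &[T],
    mask_values: &[u8],
    mask_validity: Option<&[u8]>,
) -> Vec<T> {
    assert!(data.len() <= mask_values.len() * 8, "mask is too short");
    data.iter()
        .enumerate()
        .filter(|(i, _)| {
            let valid = mask_validity.map_or(true, |v| bit_at(v, *i));
            valid && bit_at(mask_values, *i)
        })
        .map(|(_, x)| x.clone())
        .collect()
}
```

For example, filtering `[10, 20, 30, 40]` with mask values `0b0000_1011` and validity `0b0000_1110` keeps `[20, 40]`: slot 0 is true but null, so it is dropped rather than read.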

Thanks, and sorry for the spam!

-- Ben

On Wed, Feb 10, 2021 at 11:13 AM Ben Chambers <bchamb...@apache.org> wrote:

> Oh, another aspect of the issue that I forgot to mention is that
> `filter` (which I'm trying to use with these booleans) has this warning:
>
> "WARNING: the nulls of filter are ignored and the value on its slot is
> considered. Therefore, it is considered undefined behavior to pass filter
> with null values."
>
> So, I guess a third option would be a variant of `filter` which treated
> `null` as `false`.
>
> On Wed, Feb 10, 2021 at 10:50 AM Ben Chambers <bchamb...@apache.org>
> wrote:
>
>> I'm trying to implement something along the lines of "X if Y > Z", but
>> treating the case where Y or Z is null as "false". Interestingly, this is
>> difficult with the way the kernels are created:
>>
>> 1. `Y > Z` will treat `null > ???` as null.
>> "Perform left > right operation on two arrays. Non-null values are
>> greater than null values."
>>
>> 2. Ok, so maybe we write that as `(Y > Z) && not_null(Y)`.
>> "If either left or right value is null then the result is also null."
>>
>> Oh. So if the LHS is null, there is *no way* to get a boolean array with
>> a non-null value.
>>
>> 3. Ok, I'll go write my own operator to do this (`null_to_false` or
>> something like that).
>> I can do this, but it requires iterating over the booleans and combining
>> them. It seems like it would be easy to do using `buffer_bin_and`, but that
>> is only visible within the Arrow crate.
>>
>> First, am I missing something with the above analysis? Is there some way
>> to provide non-null values for a boolean array that has nulls?
>>
>> Second, if not, any thoughts on a solution? The two options I see (without
>> changing behavior of existing kernels) would be:
>> 1. Add kernel(s) that provide a value in place of `null` (a general case
>> of the `null_to_false`). These could be specialized in the boolean case to
>> use the `buffer_bin_and` as appropriate.
>> 2. Expose the `buffer_bin_and` and `buffer_bin_or` methods so that I (as
>> a user) can write the kernel myself.
>>
>
