GitHub user rspears74 closed a discussion: How to handle `null` values in a 
UDAF?

I am trying to handle null values in the `update_batch` method of an 
`Accumulator` when defining a UDAF. I've been trying something like the 
following:
```
fn update_batch(&mut self, values: &[ArrayRef]) -> Result<()> {
    let vals: Vec<String> = (&values[0])
        .as_string::<i32>()
        .iter()
        .flatten()
        .map(String::from)
        .collect();

    self.merge_values(vals);
    Ok(())
}
```
... but I get this panic:
```
thread 'tokio-runtime-worker' panicked at 
/.../.cargo/registry/src/index.crates.io-6f17d22bba15001f/arrow-data-49.0.0/src/transform/list.rs:36:69:
range end index 2 out of range for slice of length 1
```

I was able to make it work for numeric types by roughly mimicking what some of 
the internal aggregators do:
```
fn update_batch(&mut self, values: &[ArrayRef]) -> Result<()> {
    let vals = (&values[0])
        .as_primitive::<Float64Type>()
        .values()
        .to_vec();

    self.merge_values(vals);
    Ok(())
}
```
... but this doesn't seem possible with strings, since calling `values` on a 
`StringArray` returns the raw UTF-8 bytes rather than a `Vec<&str>` (or 
`Vec<String>`). Is there something I'm doing wrong in the first example, where 
I'm using `flatten`? That should flatten a list of 
`[Some("string"), Some("other string"), None]` to 
`["string", "other string"]`, but I suppose there's something going on 
internally that's not allowing that to work.
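
For reference, here is a minimal standard-library-only sketch of the `flatten` 
step in isolation. It assumes the array iterator yields `Option<&str>` items, 
as `StringArray::iter()` does; `collect_non_null` is a hypothetical helper name 
used only for illustration and is not part of arrow or DataFusion. On plain 
iterators the nulls are dropped as expected, so the panic (which is reported 
from `arrow-data`'s `transform/list.rs`, not from this chain) may originate 
somewhere other than `update_batch`, though that is only a guess.

```rust
/// Collects the non-null entries of a nullable string column into owned
/// `String`s. Hypothetical helper for illustration only.
fn collect_non_null<'a, I>(values: I) -> Vec<String>
where
    I: IntoIterator<Item = Option<&'a str>>,
{
    values
        .into_iter()
        // `Option` is itself iterable: `Some(x)` yields `x`, `None` yields
        // nothing, so `flatten` silently skips the null slots.
        .flatten()
        .map(String::from)
        .collect()
}
```

Calling it with `vec![Some("string"), Some("other string"), None]` gives a 
two-element `Vec<String>` with the `None` removed.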

GitHub link: https://github.com/apache/datafusion/discussions/8974
