Hi Wenbo,

> I'd like to know what the *three* `buffers` in ArraySpan are. What does
> `1` mean when `GetValues` is called?

The meaning of the buffers in an ArraySpan depends on the layout of its data
type. FixedSizeBinary is a fixed-size primitive type, so it has two
buffers: a validity buffer (index 0) and a data buffer (index 1).
GetValues(1) would therefore return a pointer to the data buffer. The third
slot is only used by layouts that need it, such as utf8 (validity, offsets,
data). The layouts of all data types are listed here [1].
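
To make the indexing concrete, here is a minimal standalone sketch of which
slot holds what for the two types in your UDF. It does not use Arrow headers;
`ToyLayout` and `BufferRoles` are names I made up for illustration, and the
roles simply restate the buffer listing in [1].

```cpp
#include <array>
#include <cassert>
#include <string>

// Hypothetical stand-in, not an Arrow type: the two layouts discussed here.
enum class ToyLayout { kFixedSizeBinary, kUtf8 };

// Returns the role of each of the three buffer slots for a layout.
std::array<std::string, 3> BufferRoles(ToyLayout layout) {
  switch (layout) {
    case ToyLayout::kFixedSizeBinary:
      return {"validity", "data", "unused"};
    case ToyLayout::kUtf8:
      return {"validity", "offsets", "data"};
  }
  assert(false && "unknown layout");
  return {};
}
```

So the `1` passed to GetValues selects the data buffer for your output, but
the offsets buffer for your utf8 input.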

> what is the actual type I should get from `GetValues`?

Buffer data is stored as raw bytes (uint8_t) but can be reinterpreted as
any type to suit your needs. The template parameter of GetValues is simply
forwarded to reinterpret_cast. There are discussions [2] on the soundness of
using uint8_t to represent bytes, but it is what we use now. Since you are
only doing a memcpy, uint8_t should be fine.
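
One caveat worth sketching: as far as I can tell from data.h, GetValues<T>(i)
also adds the span's offset, counted in elements of T. The toy model below is
my assumption of that behaviour, not Arrow code (`ToySpan` and
`FixedSizeBinaryValues` are hypothetical names). It shows why a non-zero
offset has to be scaled by the byte width when T is uint8_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ArraySpan; it models only what is discussed here.
struct ToySpan {
  uint8_t* buffers[3] = {nullptr, nullptr, nullptr};
  int64_t offset = 0;  // logical offset, counted in elements

  // Reinterprets buffer i as T and skips `offset` elements of type T.
  template <typename T>
  T* GetValues(int i) {
    assert(i >= 0 && i < 3);
    return reinterpret_cast<T*>(buffers[i]) + offset;
  }
};

// Start of the first value of a fixed_size_binary(byte_width) span.
// Each element is byte_width bytes wide, so the element offset is
// converted to a byte offset by hand.
inline uint8_t* FixedSizeBinaryValues(ToySpan& span, int byte_width) {
  return span.buffers[1] + span.offset * byte_width;
}
```

If the output span of your kernel always has offset 0, the two pointers
coincide and plain GetValues<uint8_t>(1) is enough.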

> Maybe, `auto *out_values =
> out->array_span_mutable()->GetValues<uint8_t *>(1);` and
> `memcpy(*out_values++, some_ptr, 32);`?

I may be missing something, but why copy to *out_values++ instead of
copying to *out_values and adding 32 to out_values afterwards? Otherwise I
agree this is the way to go.
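
Roughly what I have in mind, as a standalone sketch: `out_values` stands in
for the uint8_t pointer to the output data buffer, and `FakeDigest` is a
placeholder I invented so the example compiles. It is not SHA-256. Null
handling is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

constexpr int kDigestSize = 32;

// Placeholder "digest": NOT SHA-256. It fills 32 bytes deterministically
// so the example has something to copy. Swap in a real hash function.
void FakeDigest(const std::string& input, uint8_t* digest) {
  for (int i = 0; i < kDigestSize; ++i) {
    digest[i] = static_cast<uint8_t>(input.size() + i);
  }
}

// Writes one 32-byte value per input into `out_values`, which must point
// to at least inputs.size() * kDigestSize writable bytes.
void WriteDigests(const std::vector<std::string>& inputs, uint8_t* out_values) {
  uint8_t digest[kDigestSize];
  for (const std::string& input : inputs) {
    FakeDigest(input, digest);
    std::memcpy(out_values, digest, kDigestSize);
    out_values += kDigestSize;  // advance by one full value, not one byte
  }
}
```

The point is only that the pointer moves by the value width after each copy,
since consecutive fixed_size_binary(32) values sit back to back in the data
buffer.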

[1] https://arrow.apache.org/docs/format/Columnar.html#buffer-listing-for-each-layout
[2] https://github.com/apache/arrow/issues/36123


On Mon, Jul 17, 2023 at 4:44 PM Wenbo Hu <huwenbo1...@gmail.com> wrote:

> Hi,
>     I'm using Acero as the stream executor to run large-scale data
> transformations. The core data used in a UDF is the `ArraySpan` in
> `ExecSpan`, but there is not much documentation on ArraySpan. I'd like to
> know what the *three* `buffers` in ArraySpan are. What does `1` mean when
> `GetValues` is called?
>     For input data, I can use an `ArraySpanVisitor` to iterate over
> different input types. But for output data, I don't know how to write
> to the `array_span_mutable()` if it is not a simple c_type.
>     For example, I'm implementing a sha256 UDF whose input is
> `arrow::utf8()` and whose output is `arrow::fixed_size_binary(32)`. How
> can I write directly to the output buffers, and what is the actual
> type I should get from `GetValues`?
>     Maybe, `auto *out_values =
> out->array_span_mutable()->GetValues<uint8_t *>(1);` and
> `memcpy(*out_values++, some_ptr, 32);`?
>
> --
> ---------------------
> Best Regards,
> Wenbo Hu,
>
