Hi,

I'm using the Arrow C++ API with an external RPC library to
exchange Arrow data as record batches. I cannot figure out how
to use the arrow::Buffer interface with my RPC library properly.

Simplified version of the flow -- on the receiver of Arrow data,
my RPC library hands me a uint8_t* pointer to the serialized
Arrow data, and expects me to call rpc_free(ptr) once I am done
using it.

- First, I call arrow::Buffer::Wrap to get a buffer from that
  pointer.
- Then I deserialize the buffer using this sequence to get a
  shared_ptr to a RecordBatch:

    // buffer: input, std::shared_ptr<arrow::Buffer>
    auto input_stream = std::make_shared<arrow::io::BufferReader>(buffer);
    auto maybe_reader = arrow::ipc::RecordBatchStreamReader::Open(input_stream);
    auto maybe_batch = maybe_reader.ValueOrDie()->Next();
    return maybe_batch.ValueOrDie();

I know that Buffer::Wrap does not take ownership of the
underlying memory, so I keep the RPC pointer alive in a separate
shared_ptr kept alive using .

This works as long as the data is within the control of my
system, but breaks as soon as I need to hand it over to an
external system (such as responding to a DoGet() in a Flight
call). I have no way of knowing when the DoGet() will complete
and when to free my buffer.

I feel that it would be straightforward for arrow::Buffer to have
a constructor that takes an std::unique_ptr to an "owner" and
keeps it alive as long as it is needed. Does something like this
exist, or is it on the cards? The alternative, I think, is to
memcpy the data into memory owned by an arrow::Buffer, which I was
hoping to avoid.
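
To make the request concrete, here is a minimal, Arrow-free sketch
of the behavior I am after. OwnedView, WrapRpcMemory, FakeRpcAlloc
and FakeRpcFree are hypothetical names made up for this example
(FakeRpcFree stands in for my library's rpc_free); none of them are
Arrow or RPC library APIs. The view carries a type-erased owner, and
the owner's deleter releases the RPC memory only when the last copy
of the view goes away.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <memory>
#include <utility>

// Hypothetical stand-ins for the RPC library's allocator and rpc_free(ptr).
// The counter exists only so the example can observe when memory is freed.
static int g_free_calls = 0;

uint8_t* FakeRpcAlloc(std::size_t size) {
  return static_cast<uint8_t*>(std::malloc(size));
}

void FakeRpcFree(uint8_t* ptr) {
  ++g_free_calls;
  std::free(ptr);
}

// Hypothetical non-copying view over foreign memory that also keeps an
// "owner" alive. Copies of the view share the owner; the owner's deleter
// runs when the last copy is destroyed.
class OwnedView {
 public:
  OwnedView(const uint8_t* data, std::size_t size, std::shared_ptr<void> owner)
      : data_(data), size_(size), owner_(std::move(owner)) {}

  const uint8_t* data() const { return data_; }
  std::size_t size() const { return size_; }

 private:
  const uint8_t* data_;
  std::size_t size_;
  std::shared_ptr<void> owner_;  // type-erased; only its lifetime matters
};

// Wrap an RPC-owned pointer without copying the bytes. The custom deleter
// hands the pointer back to the RPC library once nobody references it.
OwnedView WrapRpcMemory(uint8_t* ptr, std::size_t size) {
  std::shared_ptr<void> owner(ptr, [](uint8_t* p) { FakeRpcFree(p); });
  return OwnedView(ptr, size, std::move(owner));
}
```

With something like this, whoever holds the last copy of the view
(for example the code serving a DoGet()) would trigger the free, and
I would not need to know when that happens.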

Thank you!
Ankush
