One other bit of information: I occasionally get the error message "Tried
reading schema message, was null or length 0" instead, in the exact same
context.
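
For diagnosis, here is a minimal sketch of a check that could be run on the raw bytes at the receiving end, before they are handed to Arrow. It assumes (worth verifying against the IPC format documentation) that each message in an IPC stream starts with a 4-byte continuation marker 0xFFFFFFFF followed by a 4-byte little-endian metadata length, and that a reader which does not find the marker treats the first 4 bytes as the length, as in the older format. The names IpcPrefix, loadLE32 and inspectIpcPrefix are hypothetical helpers, not Arrow APIs.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper type (not part of Arrow): summarizes the first bytes
// of a buffer that is supposed to hold an Arrow IPC stream.
struct IpcPrefix {
    bool hasContinuation;    // true if bytes 0..3 are FF FF FF FF
    int32_t metadataLength;  // the length a reader would then try to read
};

// Read a 32-bit little-endian value without relying on host byte order.
static uint32_t loadLE32(const uint8_t* p) {
    return static_cast<uint32_t>(p[0]) |
           (static_cast<uint32_t>(p[1]) << 8) |
           (static_cast<uint32_t>(p[2]) << 16) |
           (static_cast<uint32_t>(p[3]) << 24);
}

// Returns false if the buffer is null or too short to hold the 8-byte prefix.
bool inspectIpcPrefix(const uint8_t* data, std::size_t size, IpcPrefix* out) {
    if (data == nullptr || out == nullptr || size < 8) {
        return false;
    }
    const uint32_t first = loadLE32(data);
    if (first == 0xFFFFFFFFu) {
        // Assumed current layout: marker, then the metadata length.
        out->hasContinuation = true;
        out->metadataLength = static_cast<int32_t>(loadLE32(data + 4));
    } else {
        // Assumed legacy fallback: the first word itself is the length.
        out->hasContinuation = false;
        out->metadataLength = static_cast<int32_t>(first);
    }
    return true;
}
```

Calling inspectIpcPrefix(bufferPtr, bufferSize, &prefix) just before wrapping the memory in a Buffer shows what the reader is about to see. The length in the quoted error below, 165847040, is 0x09E2A000, which is the byte sequence 00 A0 E2 09 in little-endian order, so a buffer starting with those bytes would be reported as having no continuation marker. That would suggest bufferPtr does not point at the first byte the stream writer produced, or that the bytes changed in transit. A first word of zero might similarly account for the "null or length 0" message, though that is not confirmed.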
On Mar 4, 2025 at 4:34 PM -0500, Jack Wimberley <jwimber...@paradigm4.com>
wrote:
> Hello all,
>
> I am attempting to serialize and then deserialize individual RecordBatch
> objects, using the C++ library. However, I’m getting an “Invalid” result on
> the deserialization end.
>
> On the serialization end, with the help of a THROW_NOT_OK helper that throws
> on a non-OK Status or Result (and, in the latter case, returns the inner
> value), I’m serializing using
>
> // batch is a valid std::shared_ptr<RecordBatch>
> auto bufferStream = THROW_NOT_OK(io::BufferOutputStream::Create(1024));
> auto batchWriter = THROW_NOT_OK(ipc::MakeStreamWriter(bufferStream,
> batch->schema()));
> THROW_NOT_OK(batchWriter->WriteRecordBatch(*batch));
> THROW_NOT_OK(batchWriter->Close());
> auto batchBuffer = THROW_NOT_OK(bufferStream->Finish());
>
> // pass this data along
>
> The size of the buffer thus created is 1800. On the other end of the channel,
> I try to deserialize an in-memory copy of this IPC data using
>
>
> // bufferPtr is a uint8_t* const location in memory and bufferSize a
> // number of bytes
> // no-copy wrap
> auto arrowBuffer = std::make_shared<Buffer>(bufferPtr, bufferSize);
> auto bufferReader = std::make_shared<io::BufferReader>(arrowBuffer);
> auto batchReader =
> THROW_NOT_OK(ipc::RecordBatchStreamReader::Open(bufferReader));
>
>
> But the last step fails with a non-OK result, with the message
>
> Invalid: Expected to read 165847040 metadata bytes, but only read 1796
>
>
> The metadata bytes size is way off, given the serialized RecordBatch was 1800
> bytes to begin with. The number of bytes read looks about right, modulo that
> difference of 4. I saw some similar questions in the archives and online but
> the issues in them tended to be that the Close() step was missing. Other
> suggestions are a mismatch in the reader/writer format; I am using ones that
> look to me to be appropriately paired IPC stream I/O objects. Does some sort
> of header need to be written to the stream before the RecordBatch? Also, I did
> not use the WriteRecordBatch overload that takes a metadata object as its
> second argument, and the error message mentions metadata bytes; is that
> relevant?
>
> Best,
>
> Jack Wimberley