Hello all,

I am attempting to serialize and then deserialize individual RecordBatch 
objects, using the C++ library. However, I’m getting an “Invalid” result on the 
deserialization end.

On the serialization end, I use a helper, THROW_NOT_OK, that throws on a
non-OK Status or Result (and, in the latter case, returns the inner value).
I’m serializing using

// batch is a valid std::shared_ptr<RecordBatch>
auto bufferStream = THROW_NOT_OK(io::BufferOutputStream::Create(1024));
auto batchWriter = THROW_NOT_OK(ipc::MakeStreamWriter(bufferStream, batch->schema()));
auto writeStatus = THROW_NOT_OK(batchWriter->WriteRecordBatch(*batch));
THROW_NOT_OK(batchWriter->Close());
auto batchBuffer = THROW_NOT_OK(bufferStream->Finish());

// pass this data along

The buffer thus created is 1800 bytes long. On the other end of the channel, I
try to deserialize an in-memory copy of this IPC data using


// bufferPtr is a uint8_t* const location in memory and bufferSize a number of bytes
auto arrowBuffer = std::make_shared<Buffer>(bufferPtr, bufferSize); // no-copy wrap
auto bufferReader = std::make_shared<io::BufferReader>(arrowBuffer);
auto batchReader = THROW_NOT_OK(ipc::RecordBatchStreamReader::Open(bufferReader));


But the last step fails, returning a non-OK result with the message

Invalid: Expected to read 165847040 metadata bytes, but only read 1796


The expected metadata size is way off, given that the serialized RecordBatch
was only 1800 bytes to begin with. The number of bytes actually read looks
about right, apart from that difference of 4. I saw some similar questions in
the archives and online, but the issue in those tended to be that the Close()
step was missing. Other suggestions point to a mismatch between the reader and
writer formats; I am using what look to me like appropriately paired IPC
stream I/O objects. Does some sort of header need to be written to the stream
before the RecordBatch? Alternatively, I did not use the overload of
WriteRecordBatch that takes a metadata object as its second argument, and the
message mentions metadata bytes; is that relevant?
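
To make the arithmetic above concrete, here is a minimal sketch in plain C++
(no Arrow calls) of how the first bytes of the received copy could be
inspected. It assumes the IPC stream framing in which each message starts with
a 0xFFFFFFFF continuation marker followed by a little-endian 32-bit metadata
length, with a bare 32-bit length as the older framing. PrefixInfo, ReadLE32,
InspectPrefix and DemoPrefix are hypothetical names made up for this
illustration, not Arrow API. A shortfall of exactly 4 bytes would be consistent
with a reader that consumed a single 4-byte length prefix, and 165847040 is
0x09E2A000, i.e. the bytes 00 A0 E2 09 read as a little-endian integer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper type (not part of Arrow): what the start of a buffer looks like.
struct PrefixInfo {
  bool has_continuation;          // true if the buffer starts with 0xFFFFFFFF
  std::uint32_t metadata_length;  // length prefix as a reader would decode it
};

// Decode 4 bytes as a little-endian 32-bit value, independent of host endianness.
inline std::uint32_t ReadLE32(const std::uint8_t* p) {
  return static_cast<std::uint32_t>(p[0]) |
         (static_cast<std::uint32_t>(p[1]) << 8) |
         (static_cast<std::uint32_t>(p[2]) << 16) |
         (static_cast<std::uint32_t>(p[3]) << 24);
}

// Hypothetical helper: inspect the assumed framing at the start of a buffer.
// Returns false if the buffer is too short to hold the prefix.
inline bool InspectPrefix(const std::uint8_t* data, std::size_t size,
                          PrefixInfo* out) {
  if (size < 4) return false;
  const std::uint32_t first = ReadLE32(data);
  if (first == 0xFFFFFFFFu) {
    // Continuation marker present: the length is in the next 4 bytes.
    if (size < 8) return false;
    out->has_continuation = true;
    out->metadata_length = ReadLE32(data + 4);
  } else {
    // No marker: the first 4 bytes themselves get treated as the length.
    out->has_continuation = false;
    out->metadata_length = first;
  }
  return true;
}

// The length from the error message, reproduced from four raw bytes.
inline void DemoPrefix() {
  const std::uint8_t suspicious[4] = {0x00, 0xA0, 0xE2, 0x09};
  PrefixInfo info{};
  const bool ok = InspectPrefix(suspicious, sizeof(suspicious), &info);
  (void)ok;  // silence unused warnings when asserts are compiled out
  assert(ok);
  assert(!info.has_continuation);
  assert(info.metadata_length == 165847040u);
}
```

Calling InspectPrefix(bufferPtr, bufferSize, &info) on the receiving side, and
comparing the first few bytes against batchBuffer->data() on the sending side,
would at least show whether the copy still begins the way the writer produced
it.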

Best,

Jack Wimberley