John Muehlhausen created ARROW-5916:
---------------------------------------
Summary: [C++] Allow RecordBatch.length to be less than array lengths
Key: ARROW-5916
URL: https://issues.apache.org/jira/browse/ARROW-5916
Project: Apache Arrow
Issue Type: New Feature
Reporter: John Muehlhausen
Attachments: test.arrow_ipc
0.13 ignored RecordBatch.length. 0.14 requires that RecordBatch.length and
array length be equal. In the mailing list thread at
[https://lists.apache.org/thread.html/2692dd8fe09c92aa313bded2f4c2d4240b9ef75a8604ec214eb02571@%3Cdev.arrow.apache.org%3E]
we discussed relaxing this so that RecordBatch.length can be any value in
[0, array length].
If RecordBatch.length is less than array length, the reader should ignore the
portion of the array(s) beyond RecordBatch.length. This will allow partially
populated batches to be read in scenarios identified in the above discussion.
{code:c++}
Status GetFieldMetadata(int field_index, ArrayData* out) {
  auto nodes = metadata_->nodes();
  // pop off a field
  if (field_index >= static_cast<int>(nodes->size())) {
    return Status::Invalid("Ran out of field metadata, likely malformed");
  }
  const flatbuf::FieldNode* node = nodes->Get(field_index);
  // Proposed change: take the length from the RecordBatch, not the field node.
  // out->length = node->length();
  out->length = metadata_->length();
  out->null_count = node->null_count();
  out->offset = 0;
  return Status::OK();
}
{code}
Attached is a test IPC file (test.arrow_ipc) containing a batch with
RecordBatch.length 1 and array length 3.
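
To make the intended reader behavior concrete, here is a minimal standalone sketch of the proposed rule, using the same numbers as the attached file (batch length 1, array length 3). It does not use the Arrow library; EffectiveLength and VisibleValues are hypothetical names chosen for illustration, and a plain std::vector stands in for an Arrow array.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow API): computes how many rows a reader
// should expose for one array under the proposed rule. Returns false when
// batch_length falls outside [0, array_length].
bool EffectiveLength(int64_t batch_length, int64_t array_length, int64_t* out) {
  if (batch_length < 0 || batch_length > array_length) {
    return false;  // malformed: the batch claims more rows than the array holds
  }
  *out = batch_length;
  return true;
}

// Hypothetical helper: returns only the values inside the batch length,
// ignoring the unpopulated tail of the array.
std::vector<int32_t> VisibleValues(const std::vector<int32_t>& values,
                                   int64_t batch_length) {
  int64_t n = 0;
  const bool ok =
      EffectiveLength(batch_length, static_cast<int64_t>(values.size()), &n);
  assert(ok && "batch length must be within [0, array length]");
  (void)ok;  // avoid an unused-variable warning when NDEBUG is defined
  return std::vector<int32_t>(values.begin(), values.begin() + n);
}
```

With an array holding {10, 20, 30} and a batch length of 1, VisibleValues returns a single element, 10; a batch length of 4 is rejected by EffectiveLength because it exceeds the array length.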