Hi,

Unlike Arrow, ORC records a null entry only in the PRESENT stream (the
equivalent of the validity bitmap in Arrow) and not in any DATA stream. This
holds for every type, including numeric types. Hence the notNull (aka PRESENT)
and data buffers from ORC generally don’t have the same size.
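
To make the mismatch concrete, here is a minimal sketch in plain C++ of what I
mean. It does not call Arrow or ORC. ExpandDense is a name I made up for
illustration, and the sketch assumes one validity byte per row and a value list
that contains only the non-null entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of Arrow or ORC): expands a dense list of
// non-null values into one slot per row, guided by per-row validity bytes.
std::vector<int64_t> ExpandDense(const std::vector<int64_t>& dense,
                                 const std::vector<uint8_t>& present) {
  // One slot per row; slots for null rows keep a placeholder value of 0.
  std::vector<int64_t> slots(present.size(), 0);
  std::size_t next = 0;  // index of the next unread dense value
  for (std::size_t row = 0; row < present.size(); ++row) {
    if (present[row] != 0) {
      // at() throws std::out_of_range if the dense list is too short.
      slots[row] = dense.at(next++);
    }
  }
  return slots;
}
```

For the four rows 10, null, 30, null, the validity bytes would be {1, 0, 1, 0}
and the dense values {10, 30}, so the two inputs have sizes 4 and 2. Only after
expansion do the values line up with the validity bytes, one slot per row.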

However, according to line 126 of cpp/src/arrow/adapters/orc/adapter_util.cc,
builder->AppendValues(source, length, valid_bytes) is called directly, with
builder being an Int64Builder and with source and valid_bytes having different
sizes, which doesn’t seem reasonable. May I ask whether this is actually valid
usage of AppendValues? Thanks!


Best,
Ying Zhou
