Hi,

Unlike Arrow, ORC records a null entry only in the PRESENT stream (the equivalent of Arrow’s validity bitmap) and not in any DATA stream. This holds for every type, including numeric types. Hence the notNull (aka PRESENT) and data buffers from ORC generally don’t have the same size.
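
To make the size mismatch concrete, here is a minimal C++ sketch, assuming the dense DATA layout described above (one value per non-null row). ExpandDense is a hypothetical helper written only for illustration; it is not part of Arrow or ORC. It shows what it would take to line the values up with the PRESENT bytes, one slot per row:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an Arrow or ORC API): expands a dense value
// buffer, which holds one entry per non-null row, into a buffer with one
// slot per row so that it has the same length as the PRESENT bytes.
std::vector<int64_t> ExpandDense(const std::vector<int64_t>& dense,
                                 const std::vector<uint8_t>& present) {
  std::vector<int64_t> aligned(present.size(), 0);  // null slots are set to 0
  std::size_t next = 0;  // index of the next unread dense value
  for (std::size_t row = 0; row < present.size(); ++row) {
    // Copy a value only for rows marked present; stop reading if the
    // dense buffer runs out earlier than the PRESENT bytes suggest.
    if (present[row] != 0 && next < dense.size()) {
      aligned[row] = dense[next++];
    }
  }
  return aligned;
}
```

For example, with PRESENT bytes {1, 0, 1, 1, 0} the dense buffer holds only three values, while the aligned buffer holds five.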
However, according to cpp/src/arrow/adapters/orc/adapter_util.cc, line 126, it is possible to call builder->AppendValues(source, length, valid_bytes) directly, with builder being an Int64Builder and with source and valid_bytes having different sizes, which doesn’t seem reasonable. May I ask whether this is actually a valid use of AppendValues?

Thanks!

Best,
Ying Zhou