Hi Micah,

I think it's most productive to view things through the lens of the binary protocol (i.e. the IPC/RPC wire format).
On Wed, Mar 27, 2019 at 1:59 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> Similar to how there is no validity-buffer required in the format if
> null_count == 0, is a similar optimization for the "data buffer" allowed
> when (null_count == array length)? It seems that if all values are null,
> no data element should ever be accessed, but I couldn't find if this was
> ever discussed.

My understanding is that it's permissible to send a length-0 buffer in the IPC format for the validity bitmap in the event that there are no nulls. The receiver may decide to allocate a bitmap of all set bits if they need that. I think Java needs this, but I am not sure whether this case has been implemented and tested (it should be if it is not).

In principle I don't see an issue with having an analogous optimization for the data buffer of all-null arrays, but AFAIK no implementation currently allows this. It would probably merit some discussion, and likely a change to the format documents, if we wanted to allow it.

As a high-level guiding principle, I don't see value in sending ersatz buffers on the wire whose contents are never used. Similarly, I don't see value in unnecessarily creating a validity bitmap of all 1's, which adds overhead in the case where known non-nullable data is being processed. The downside is that receivers may need to sanitize a null / length-0 buffer in such cases, or have a special case in some algorithms.

- Wes

>
> A quick perusal of the spec seems to imply they are required (I couldn't
> find an exception). But I just wanted to confirm this was the case.
>
> Thanks,
> Micah
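
To illustrate the receiver-side sanitization mentioned above, here is a minimal Python sketch. It assumes a length-0 validity buffer means "no nulls" and that bitmaps use Arrow's least-significant-bit numbering. The function names `materialize_validity` and `is_valid` are hypothetical and are not part of any Arrow implementation's API.

```python
def materialize_validity(validity: bytes, length: int) -> bytes:
    """Return a usable validity bitmap for an array of `length` values.

    Assumption: an empty buffer means every value is valid, so a bitmap
    with the first `length` bits set is allocated in its place.
    """
    if len(validity) > 0:
        return validity  # sender provided a real bitmap; use it as-is

    n_bytes = (length + 7) // 8
    bitmap = bytearray(b"\xff" * n_bytes)
    remainder = length % 8
    if remainder:
        # Leave the padding bits past `length` in the last byte unset.
        bitmap[-1] = (1 << remainder) - 1
    return bytes(bitmap)


def is_valid(validity: bytes, i: int) -> bool:
    """Alternative: special-case the empty buffer instead of allocating."""
    if len(validity) == 0:
        return True
    return bool((validity[i // 8] >> (i % 8)) & 1)
```

The first function pays an allocation so downstream code can stay uniform; the second avoids the allocation at the cost of a branch in every algorithm that reads validity. That is the trade-off described above.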