Let me fix a typo

On Wed, Mar 27, 2019 at 7:40 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> hi Micah,
>
> I think it's most productive to view things through the lens of the
> binary protocol (i.e. the IPC/RPC wire format)
>
> On Wed, Mar 27, 2019 at 1:59 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
> >
> > Similar to how there is no validity-buffer required in the format if
> > null_count == 0, is a similar optimization for the "data buffer" allowed
> > when (null_count == array length)? It seems that if all values are null,
> > no data element should ever be accessed, but I couldn't find if this was
> > ever discussed.
>
> My understanding is that it's permissible to send a length-0 buffer in
> the IPC format for the validity bitmap in the event that the values
> are all null. The receiver may decide to allocate a bitmap of all set
> bits if they need that. I think Java needs this, but I am not sure if
> this case has been implemented and tested (it should be if it is not)
>
This should read "in the event that the values are all _not_ null"

> In principle I don't see an issue with having an analogous
> optimization for all-null arrays with the data buffer, but no
> implementation AFAIK allows for this currently. This probably would
> merit some discussion and probably a change to the format documents if
> we wanted to allow this.
>
> As a high-level guiding principle, I don't see value in sending ersatz
> buffers on the wire whose contents are never used. Similarly, I don't
> see value in creating a validity bitmap of all 1's unnecessarily, which
> adds overhead in the case where known non-nullable data is being
> processed.
>
> The downside is that receivers may need to sanitize a null / length-0
> buffer in such cases or have a special case in some algorithms
>
> - Wes
>
> >
> > A quick perusal of the spec seems to imply they are required (I couldn't
> > find an exception). But I just wanted to confirm this was the case.
> >
> > Thanks,
> > Micah
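
To make the receiver-side "sanitize" step concrete, here is a rough sketch in plain Python of what materializing an omitted validity bitmap could look like. It does not use any Arrow library API; `materialize_validity` and `is_valid` are hypothetical names invented for illustration, and the sketch assumes only the format's least-significant-bit numbering for bitmaps. It ignores the buffer padding and alignment that a real implementation would also need to handle.

```python
def materialize_validity(validity: bytes, length: int, null_count: int) -> bytes:
    """Return a usable validity bitmap for an array with `length` slots.

    Hypothetical helper: if the sender omitted the bitmap (length-0 buffer)
    because no value is null, build a bitmap with one set bit per slot.
    """
    n_bytes = (length + 7) // 8
    if len(validity) == 0:
        if null_count != 0:
            raise ValueError("empty validity buffer but null_count != 0")
        bitmap = bytearray(b"\xff" * n_bytes)
        trailing = length % 8
        if trailing:
            # Keep bits beyond `length` in the last byte cleared.
            bitmap[-1] = (1 << trailing) - 1
        return bytes(bitmap)
    if len(validity) < n_bytes:
        raise ValueError("validity buffer too short for array length")
    return validity


def is_valid(bitmap: bytes, i: int) -> bool:
    """Test slot i using least-significant-bit numbering."""
    return bool(bitmap[i // 8] & (1 << (i % 8)))
```

A receiver that prefers not to allocate could instead special-case the empty buffer in each kernel, which is the other option mentioned above.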