Hi Micah,

I think it's most productive to view things through the lens of the
binary protocol (i.e. the IPC/RPC wire format).

On Wed, Mar 27, 2019 at 1:59 AM Micah Kornfield <emkornfi...@gmail.com> wrote:
>
> Similar to how there is no validity-buffer required in the format if
> null_count == 0, is a similar optimization for the "data buffer" allowed
> when (null_count == array length)?  It seems that if all values are null,
> no data element should ever be accessed, but I couldn't find if this was
> ever discussed.

My understanding is that it's permissible to send a length-0 buffer in
the IPC format for the validity bitmap in the event that the values
are all non-null (null_count == 0). The receiver may decide to allocate
a bitmap of all set bits if they need that. I think Java needs this,
but I am not sure if this case has been implemented and tested (it
should be if it is not).
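
To make the receiver side concrete, here is a minimal sketch in plain
Python (not any implementation's actual API). The function name and
signature are hypothetical. It assumes the bitmap arrives as bytes and
uses least-significant-bit numbering, with unused trailing bits left
at 0.

```python
def materialize_validity(bitmap: bytes, length: int, null_count: int) -> bytes:
    """Return a usable validity bitmap for an array of `length` values.

    Hypothetical helper: a length-0 bitmap is accepted only when
    null_count == 0, and is replaced by a bitmap with all bits set.
    """
    if len(bitmap) > 0:
        return bitmap  # the sender supplied a real bitmap; use it as-is
    if null_count != 0:
        raise ValueError("empty validity bitmap but null_count != 0")
    full_bytes, rest = divmod(length, 8)
    out = bytearray(b"\xff" * full_bytes)
    if rest:
        out.append((1 << rest) - 1)  # set only the low `rest` bits
    return bytes(out)
```

A receiver that never reads the bitmap can skip this step entirely,
which is the point of allowing the length-0 buffer.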

In principle I don't see an issue with having an analogous
optimization for all-null arrays with the data buffer, but no
implementation AFAIK allows for this currently. This would merit
some discussion and probably a change to the format documents if
we wanted to allow it.

As a high-level guiding principle, I don't see value in sending ersatz
buffers on the wire whose contents are never used. Similarly, I don't
see value in creating a validity bitmap of all 1's unnecessarily, since
that adds overhead when data known to be non-nullable is being
processed.

The downside is that receivers may need to sanitize a null / length-0
buffer in such cases or have a special case in some algorithms.
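
As an illustration of the "special case in some algorithms" option,
here is a rough sketch in plain Python. The names is_valid and
sum_valid are hypothetical, and it assumes least-significant-bit
numbering. An empty bitmap is treated as "every value is valid", so
nothing has to be allocated.

```python
def is_valid(bitmap: bytes, i: int) -> bool:
    """Check slot i, treating a length-0 bitmap as all-valid."""
    if len(bitmap) == 0:
        return True  # special case: no bitmap was sent, so no nulls
    return ((bitmap[i // 8] >> (i % 8)) & 1) == 1


def sum_valid(values, bitmap: bytes) -> int:
    """Sum only the non-null values of an integer array."""
    return sum(v for i, v in enumerate(values) if is_valid(bitmap, i))
```

For example, sum_valid([1, 2, 3], b"") adds all three values, while
sum_valid([1, 2, 3], b"\x05") skips the middle one.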

- Wes

>
> A quick perusal of the spec seems to imply they are required (I couldn't
> find an exception).  But I just wanted to confirm this was the case.
>
> Thanks,
> Micah
