alamb opened a new issue, #14115:
URL: https://github.com/apache/datafusion/issues/14115

   ### Is your feature request related to a problem or challenge?
   
   DataFusion uses `BooleanBuffer` in several places to create null buffers. I 
thought I had found a clever optimization for handling data with no nulls, which I 
filed in arrow-rs:
   - https://github.com/apache/arrow-rs/issues/6973
   
   However, @tustvold pointed out that 
[`NullBufferBuilder`](https://docs.rs/arrow-buffer/latest/arrow_buffer/builder/struct.NullBufferBuilder.html)
 already has exactly the optimization described.
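
For readers who have not looked at it: the optimization in question is that the builder does not materialize a validity bitmap until the first null is appended, and `NullBufferBuilder::finish` returns `None` when no null was ever seen. The following is a minimal standard-library sketch of that idea only. `LazyNullBuilder` is a hypothetical name, and it stores one `bool` per slot rather than a packed bitmap, so it is a model of the behavior and not the arrow-buffer implementation.

```rust
/// Hypothetical, simplified model of a lazily materialized validity builder.
/// Assumption: one `bool` per slot instead of a packed bitmap, for clarity.
#[derive(Debug, Default)]
pub struct LazyNullBuilder {
    /// Allocated only once the first null is appended.
    validity: Option<Vec<bool>>,
    /// Number of slots appended so far.
    len: usize,
}

impl LazyNullBuilder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Append a valid (non-null) slot. While no null has been seen,
    /// this only bumps a counter and allocates nothing.
    pub fn append_non_null(&mut self) {
        if let Some(v) = self.validity.as_mut() {
            v.push(true);
        }
        self.len += 1;
    }

    /// Append a null slot. The first null materializes the buffer and
    /// backfills every earlier slot as valid.
    pub fn append_null(&mut self) {
        let len = self.len;
        self.validity
            .get_or_insert_with(|| vec![true; len])
            .push(false);
        self.len += 1;
    }

    /// Return the validity values, or `None` if no null was ever appended,
    /// and reset the builder for reuse.
    pub fn finish(&mut self) -> Option<Vec<bool>> {
        self.len = 0;
        self.validity.take()
    }
}
```

With this model, a column with no nulls never allocates a validity buffer at all, which is the case a plain `BooleanBufferBuilder` cannot skip because it writes a bit for every appended value.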
   
   I looked at the DataFusion codebase and found several places where we use 
`BooleanBufferBuilder` rather than `NullBufferBuilder`:
   
   
https://github.com/search?q=repo%3Aapache%2Fdatafusion%20BooleanBufferBuilder&type=code
   
   
   DataFusion even has its own reimplementation of the `NullBufferBuilder` optimization 🤦  : 
https://github.com/apache/datafusion/blob/63b94c8f9e128b938e81b7e867ce6256a94d67e6/datafusion/physical-plan/src/aggregates/group_values/null_builder.rs#L20-L32
   
   
   
   
   ### Describe the solution you'd like
   
   I would like to switch DataFusion to using `NullBufferBuilder` instead of 
`BooleanBufferBuilder` wherever possible.
   
   Note that until the change tracked in the following issue is available, this 
will involve adding an explicit dependency on `arrow_buffer`:
   - https://github.com/apache/arrow-rs/issues/6975
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

