Rachelint commented on PR #11319:
URL: https://github.com/apache/datafusion/pull/11319#issuecomment-2212686882

   > > I suspect the remaining cases where we are using collect could be made 
more efficient using the Builder pattern?
   > 
   > I think the reason the Builder is faster for Strings / Binary is that due 
to how the references worked out, we can avoid copying the strings via 
`value.to_string()`
   > 
   > I don't think the Builder pattern is fundamentally better/worse than the 
`from_iter` (under the covers they all end up doing the same thing in arrow-rs 
I think)
   
   Yes, I tested the `UInt64Array` case in my POC; using the `Builder` directly 
is actually a bit slower than using `from_iter`.
   
https://github.com/Rachelint/arrow-datafusion/blob/70b9f05e737e81c514259e70a8bd2e9f0ad8e725/datafusion/core/src/datasource/physical_plan/parquet/statistics.rs#L790
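To make the point about avoiding `value.to_string()` concrete, here is a minimal std-only sketch. It does not use arrow-rs: `StringColumn`, `build_via_owned`, and `build_via_builder` are hypothetical names for illustration, and the layout (one byte buffer plus offsets) is only a simplified model of a string array. Both paths end up with the same buffers; the difference is that the first allocates an intermediate `String` per value.

```rust
/// Toy stand-in for a variable-length string column: one contiguous byte
/// buffer, offsets into it, and a validity flag per slot.
/// Hypothetical type for illustration only, NOT the arrow-rs API.
#[derive(Debug, PartialEq)]
pub struct StringColumn {
    values: Vec<u8>,
    offsets: Vec<usize>,
    validity: Vec<bool>,
}

impl StringColumn {
    pub fn new() -> Self {
        // The offsets vector always starts with 0 so that slot `i` spans
        // `offsets[i]..offsets[i + 1]`.
        Self { values: Vec::new(), offsets: vec![0], validity: Vec::new() }
    }

    /// Builder-style append: bytes are copied once, straight from the
    /// borrowed `&str` into the shared buffer.
    pub fn append_option(&mut self, value: Option<&str>) {
        if let Some(s) = value {
            self.values.extend_from_slice(s.as_bytes());
        }
        self.offsets.push(self.values.len());
        self.validity.push(value.is_some());
    }

    pub fn len(&self) -> usize {
        self.validity.len()
    }

    pub fn get(&self, i: usize) -> Option<&str> {
        if !self.validity[i] {
            return None;
        }
        std::str::from_utf8(&self.values[self.offsets[i]..self.offsets[i + 1]]).ok()
    }
}

/// Path A: materialize owned `String`s first (one heap allocation per
/// non-null value), then copy them into the column.
pub fn build_via_owned(stats: &[Option<&str>]) -> StringColumn {
    let owned: Vec<Option<String>> = stats
        .iter()
        .map(|&v| v.map(|s| s.to_string()))
        .collect();
    let mut col = StringColumn::new();
    for v in &owned {
        col.append_option(v.as_deref());
    }
    col
}

/// Path B: append the borrowed values directly, with no intermediate `String`.
pub fn build_via_builder(stats: &[Option<&str>]) -> StringColumn {
    let mut col = StringColumn::new();
    for &v in stats {
        col.append_option(v);
    }
    col
}
```

For fixed-width values such as `u64` there is no per-value allocation to skip in the first place, which is consistent with the builder showing no advantage over `from_iter` in that case.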
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

