rluvaton commented on code in PR #17592: URL: https://github.com/apache/datafusion/pull/17592#discussion_r2354904775
########## datafusion/physical-plan/src/aggregates/group_values/multi_group_by/mod.rs: ########## @@ -88,6 +88,34 @@ pub trait GroupColumn: Send + Sync { /// The vectorized version `append_val` fn vectorized_append(&mut self, array: &ArrayRef, rows: &[usize]) -> Result<()>; + /// Whether this builder supports [`Self::append_array_slice`] optimization + /// In case it returns true, [`Self::append_array_slice`] must be implemented + fn support_append_array_slice(&self) -> bool { + false + } + + /// Append slice of values from `array`, starting at `start` for `length` rows + /// + /// This is a special case of `vectorized_append` when the rows are continuous + /// + /// You should implement this to optimize large copies of contiguous values. + /// + /// This does not get the sliced array even though it would be more user-friendly + /// to allow optimization that avoid the additional computation that can happen in a slice + /// + /// Note: in order for this to be used, [`Self::support_append_array_slice`] must return true + fn append_array_slice( + &mut self, + _array: &ArrayRef, + _start: usize, + _length: usize, + ) -> Result<()> { + assert!(!self.support_append_array_slice(), "support_append_array_slice() return true while append_array_slice() is not implemented"); + not_impl_err!( + "append_array_slice is not implemented for this GroupColumn, please implement it as well as support_append_array_slice" + ) + } Review Comment: Originally I did not had `support_append_array_slice` and I had this default implementation for `append_array_slice`: ```rust fn append_array_slice( &mut self, array: &ArrayRef, start: usize, length: usize, ) -> Result<()> { let rows = (start..start + length).collect::<Vec<_>>(); self.vectorized_append(array, &rows) } ``` but I moved out of this to avoid having the allocation here as we already hold the `append_row_indices`. as I saw from the benchmarks that it can be slower for non supported impl -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org