rluvaton commented on code in PR #17592:
URL: https://github.com/apache/datafusion/pull/17592#discussion_r2354904775


##########
datafusion/physical-plan/src/aggregates/group_values/multi_group_by/mod.rs:
##########
@@ -88,6 +88,34 @@ pub trait GroupColumn: Send + Sync {
     /// The vectorized version `append_val`
     fn vectorized_append(&mut self, array: &ArrayRef, rows: &[usize]) -> 
Result<()>;
 
+    /// Whether this builder supports [`Self::append_array_slice`] optimization
+    /// In case it returns true, [`Self::append_array_slice`] must be 
implemented
+    fn support_append_array_slice(&self) -> bool {
+        false
+    }
+
+    /// Append slice of values from `array`, starting at `start` for `length` 
rows
+    ///
+    /// This is a special case of `vectorized_append` when the rows are 
continuous
+    ///
+    /// You should implement this to optimize large copies of contiguous 
values.
+    ///
+    /// This does not get the sliced array even though it would be more 
user-friendly
+    /// to allow optimization that avoid the additional computation that can 
happen in a slice
+    ///
+    /// Note: in order for this to be used, 
[`Self::support_append_array_slice`] must return true
+    fn append_array_slice(
+        &mut self,
+        _array: &ArrayRef,
+        _start: usize,
+        _length: usize,
+    ) -> Result<()> {
+        assert!(!self.support_append_array_slice(), 
"support_append_array_slice() return true while append_array_slice() is not 
implemented");
+        not_impl_err!(
+            "append_array_slice is not implemented for this GroupColumn, 
please implement it as well as support_append_array_slice"
+        )
+    }

Review Comment:
   Originally I did not had `support_append_array_slice` and I had this default 
implementation for `append_array_slice`:
   ```rust
   fn append_array_slice(
           &mut self,
           array: &ArrayRef,
           start: usize,
           length: usize,
       ) -> Result<()> {
           let rows = (start..start + length).collect::<Vec<_>>();
           self.vectorized_append(array, &rows)
       }
   ```
   
   but I moved out of this to avoid having the allocation here as we already 
hold the `append_row_indices`.
   as I saw from the benchmarks that it can be slower for non supported impl



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to