alamb commented on code in PR #16346: URL: https://github.com/apache/datafusion/pull/16346#discussion_r2138985658
########## datafusion/functions-aggregate/src/array_agg.rs: ########## @@ -313,7 +315,11 @@ impl Accumulator for ArrayAggAccumulator { }; if !val.is_empty() { - self.values.push(val); + // The ArrayRef might be holding a reference to its original input buffer, so + // storing it here directly copied/compacted avoids over accounting memory + // not used here. + self.values + .push(make_array(copy_array_data(&val.to_data()))); Review Comment: I found this code confusing at first too so I tried to add some additional documentation - https://github.com/apache/datafusion/pull/16361 Another thing I found might make this code easier to understand would be to refactor this into a function so it looks more like ```suggestion .push(copy_array(val)) ``` Or something like that ```rust /// Copies an array to a new array with mimimal memory overhead fn copy_array(array: &dyn Array) -> ArrayRef { .. } ``` Or something like that . This is definitely not required just something that occured to me while reviewing ########## datafusion/functions-aggregate/src/array_agg.rs: ########## @@ -980,6 +994,56 @@ mod tests { Ok(()) } + #[test] Review Comment: I verified these tests cover the code in this PR -- they fail without the changes in the PR ``` assertion `left == right` failed left: 2652 right: 732 ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org