alamb commented on code in PR #16346:
URL: https://github.com/apache/datafusion/pull/16346#discussion_r2138985658


##########
datafusion/functions-aggregate/src/array_agg.rs:
##########
@@ -313,7 +315,11 @@ impl Accumulator for ArrayAggAccumulator {
         };
 
         if !val.is_empty() {
-            self.values.push(val);
+            // The ArrayRef might be holding a reference to its original input 
buffer, so
+            // storing it here directly copied/compacted avoids over 
accounting memory
+            // not used here.
+            self.values
+                .push(make_array(copy_array_data(&val.to_data())));

Review Comment:
   I found this code confusing at first too so I tried to add some additional 
documentation
   - https://github.com/apache/datafusion/pull/16361
   
   Another thing I found might make this code easier to understand would be to 
refactor this into a function so it looks more like
   
   ```suggestion
                   .push(copy_array(val))
   ```
   
   Or something like that 
   ```rust
   /// Copies an array to a new array with mimimal memory overhead
   fn copy_array(array: &dyn Array) -> ArrayRef {
   ..
   }
   ```
   
   Or something like that .
   
   This is definitely not required just something that occured to me while 
reviewing



##########
datafusion/functions-aggregate/src/array_agg.rs:
##########
@@ -980,6 +994,56 @@ mod tests {
         Ok(())
     }
 
+    #[test]

Review Comment:
   I verified these tests cover the code in this PR -- they fail without the 
changes in the PR
   
   ```
   
   assertion `left == right` failed
     left: 2652
    right: 732
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to