sfluor opened a new issue, #16517:
URL: https://github.com/apache/datafusion/issues/16517

   ### Describe the bug
   
   When aggregating with `array_agg` using a limited memory pool we hit 
ResourceExhausted errors in the `array_agg` accumulator.
   
   After investigation this seems to come from over-accounting memory from 
shared arrow buffers. We might keep a single value from a 8k rows arrow array 
but will account the memory taken by the whole arrow array rather than just the 
value we kept.
   
   ### To Reproduce
   
   Run `datafusion-cli` with a memory limit `datafusion-cli -m 100m`
   
   Create a table:
   
   ```sql
   CREATE table logs AS
   
   SELECT 
   ('Chrome-' || (random() * 1000)::int) as user_agent,
       (n.value % 40000 )::text as client_id,
       (n.value % 21 ==0) as is_internal
   FROM
   generate_series(1, 300000) n;
   ```
   
   Try running a simple `array_agg` query:
   
   ```
   WITH
     user_agents AS (
       SELECT 
       DISTINCT user_agent, client_id
       FROM
        logs
       WHERE
         NOT is_internal
     )
   SELECT
     client_id,
     ARRAY_AGG(user_agent) AS user_agent
   FROM
     user_agents
   GROUP BY
     client_id
   ```
   
   Fails with:
   
   ```
   Resources exhausted: Additional allocation failed with top memory consumers 
(across reservations) as:
     GroupedHashAggregateStream[6] (array_agg(user_agents.user_agent))#44(can 
spill: true) consumed 2.9 MB,
     GroupedHashAggregateStream[2] (array_agg(user_agents.user_agent))#36(can 
spill: true) consumed 2.2 MB,
     GroupedHashAggregateStream[3] (array_agg(user_agents.user_agent))#38(can 
spill: true) consumed 2.2 MB.
   Error: Failed to allocate additional 609.0 MB for 
GroupedHashAggregateStream[0] (array_agg(user_agents.user_agent)) with 0.0 B 
already allocated for this reservation - 75.3 MB remain available 
   for the total pool
   ```
   
   Further inspection (adding debug prints) shows that we over-report memory in 
the `array_agg` `merge_batch` 
[function](https://github.com/apache/datafusion/blob/main/datafusion/functions-aggregate/src/array_agg.rs#L344):
   
   ```
   Addr:0x418911c18b0 Data type:Utf8 Length: 1, Size of first value: 9, Mem 
size: 162362
   [X][Utf8] 1 Size before: 56 + mem_size:162362 -> 9
   [X][Utf8] 1 Size after: 162482
   ```
   
   
   
   ### Expected behavior
   
   Query should succeed within the memory limits
   
   ### Additional context
   
   Similar to: https://github.com/apache/datafusion/issues/16055


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to