EmilyMatt opened a new issue, #14851:
URL: https://github.com/apache/datafusion/issues/14851

   ### Describe the bug
   
   In the GroupedHashAggregateStream, update_memory_reservation() has a 
try_resize() call, an error returned from this function, is usually handled 
gracefully, as we are usually in the middle of emitting and can just emit 
again, or are accumulating and can spill(unless not even a single batch fits in 
memory, in which case this can be a justified panic).
   
   However, when creating a merge stream using the spill data, there is no 
viable error fallback, this early exits on error if the memory pool denied the 
reservation.
   Using a custom memory pool, it should be possible to implement a "burst" 
allocation or something similar, and avoid aborting a query, but this requires 
the aggregate a "declaration of intentions" of sorts, shouldn't resize() be 
used in such cases, instead of try_resize()?
   
   This is the problematic line:
   
https://github.com/apache/datafusion/blob/6c5f214643fe81affcad5f3aa9031f02c4390bf0/datafusion/physical-plan/src/aggregates/row_hash.rs#L1050
   
   I believe in line 910 as well,
   `match self.update_memory_reservation() {
               // Here we can ignore `insufficient_capacity_err` because we 
will spill later,
               // but at least one batch should fit in the memory
               Err(DataFusionError::ResourcesExhausted(_))
                   if self.group_values.len() >= self.batch_size =>
               {
                   Ok(())
               }
               other => other,
           }`
   
   This should check preemptively if we have a possibility of spilling here, 
and if not, it should be a resize(), not a try_resize() as 
update_memory_reservation() does.
   But this one actually has a possibility of not aborting so seems more 
agreeable.
   
   It's possible that there are other design ideas taken into consideration 
here, but this is my belief at least.
   
   ### To Reproduce
   
   _No response_
   
   ### Expected behavior
   
   _No response_
   
   ### Additional context
   
   _No response_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to