Rachelint commented on PR #15591:
URL: https://github.com/apache/datafusion/pull/15591#issuecomment-2907882330

   > I wonder what the plan is for this PR?
   > 
   > From what I understand, it currently improves performance for aggregates 
with large numbers of groups, but (slightly) slows down aggregates for smaller 
numbers of groups. I think this is due to accessing group storage via two 
indirections (block index / actual index)
   > 
   > It seems like the 
[proposal](https://github.com/apache/datafusion/pull/15591#issuecomment-2890775426)
 is to have some sort of adaptive structure that uses one part indexes for 
small numbers of groups and then switches to two part indexes for larger 
numbers.
   
   I am experimenting with whether we can build on this PR to improve 
performance further, for example through 
[memory reuse](https://github.com/apache/datafusion/pull/15591#issuecomment-2890663260). 
#16135 is part of that attempt.
   
   My biggest concern is whether we can get a clear enough improvement to 
make the change worthwhile...
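
For readers following the thread, here is a minimal sketch of the two access patterns described in the quoted question: a flat store indexed by a single group index, and a blocked store indexed by a (block index, offset) pair. All names here (`FlatStore`, `BlockedStore`, `split_index`, `BLOCK_SIZE`) are hypothetical and chosen for illustration only; this is not the code in this PR or DataFusion's actual group storage.

```rust
// Hypothetical illustration only; not DataFusion's actual types.
// BLOCK_SIZE is deliberately tiny so the block boundaries are easy to see.
const BLOCK_SIZE: usize = 4;

/// Flat storage: one contiguous Vec, one lookup per access.
struct FlatStore {
    values: Vec<u64>,
}

impl FlatStore {
    fn new() -> Self {
        Self { values: Vec::new() }
    }

    /// Appends a value and returns its group index.
    fn push(&mut self, v: u64) -> usize {
        self.values.push(v);
        self.values.len() - 1
    }

    fn get(&self, group: usize) -> u64 {
        self.values[group]
    }
}

/// Splits a flat group index into (block index, offset within block).
fn split_index(group: usize) -> (usize, usize) {
    (group / BLOCK_SIZE, group % BLOCK_SIZE)
}

/// Blocked storage: fixed-size blocks, so growing never copies old values,
/// but every access goes through two lookups (block, then offset).
struct BlockedStore {
    blocks: Vec<Vec<u64>>,
    len: usize,
}

impl BlockedStore {
    fn new() -> Self {
        Self { blocks: Vec::new(), len: 0 }
    }

    /// Appends a value, allocating a new block when the last one is full.
    fn push(&mut self, v: u64) -> usize {
        let (block, _) = split_index(self.len);
        if block == self.blocks.len() {
            self.blocks.push(Vec::with_capacity(BLOCK_SIZE));
        }
        self.blocks[block].push(v);
        self.len += 1;
        self.len - 1
    }

    fn get(&self, group: usize) -> u64 {
        let (block, offset) = split_index(group);
        self.blocks[block][offset]
    }

    fn num_blocks(&self) -> usize {
        self.blocks.len()
    }
}
```

In this sketch the blocked store avoids reallocating and copying existing values as the number of groups grows, while the flat store needs only one lookup per access. That is the trade-off the quoted question attributes the small-group slowdown to.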


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

