korowa commented on PR #15324:
URL: https://github.com/apache/datafusion/pull/15324#issuecomment-2739393073
   Thank you @waynexia, I'm planning to check it out at most tomorrow.
   
   I have a question in advance before reviewing -- have you been considering 
to implement groups accumulator for 
[specialized](https://github.com/apache/datafusion/tree/main/datafusion/functions-aggregate-common/src/aggregate/count_distinct)
 cases of DistinctCountAccumulator (primitive/native types and bytes)?
   
   I'm asking because, as for me, it looks a bit odd (though I haven't 
rechecked performance results, and perhaps GroupsAccumulatorAdapter introduces 
some insane overhead), that switching from native Rust types to ScalarValue 
still gives x5 faster execution, while groups accumulator in this case, if I'm 
not mistaken, does basically the same as the GroupsAccumulatorAdapter -- 
storing separate states (hashsets) in the vector is already 
[implemented](https://github.com/apache/datafusion/blob/97548a2584614acb3211b862e1ebca8349b6a832/datafusion/functions-aggregate-common/src/aggregate/groups_accumulator.rs#L96)
 in the adapter.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to