Rachelint commented on code in PR #13681:
URL: https://github.com/apache/datafusion/pull/13681#discussion_r1933239306
##########
datafusion/functions-aggregate/src/median.rs:
##########
@@ -230,6 +276,212 @@ impl<T: ArrowNumericType> Accumulator for MedianAccumulator<T> {
}
}
+/// The median groups accumulator accumulates the raw input values
+///
+/// For calculating the accurate medians of groups, we need to store all values
+/// of groups before final evaluation.
+/// So values in each group will be stored in a `Vec<T>`, and the total group values
+/// will be actually organized as a `Vec<Vec<T>>`.
+///
+#[derive(Debug)]
+struct MedianGroupsAccumulator<T: ArrowNumericType + Send> {
+ data_type: DataType,
+ group_values: Vec<Vec<T::Native>>,
Review Comment:
@korowa Does this mean that implementing a `group accumulator` for
`distinct count` does not bring an obvious improvement?
That really surprises me. I am still reading through
https://github.com/apache/datafusion/pull/8721
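
For context, the `Vec<Vec<T>>` layout described in the doc comment above can
be sketched roughly as follows. This is only an illustration under
simplifying assumptions, not the code in this PR: `MedianGroupsSketch`, its
`update_batch` / `evaluate` methods, and the `median` helper are hypothetical
names, values are plain `f64` rather than `T::Native`, and Arrow arrays,
nulls, and filters are left out.

```rust
/// Hypothetical, simplified stand-in for the accumulator in the diff:
/// plain `f64` values, no Arrow arrays, no null or filter handling.
#[derive(Debug, Default)]
struct MedianGroupsSketch {
    /// `group_values[g]` holds every raw value seen so far for group `g`.
    group_values: Vec<Vec<f64>>,
}

impl MedianGroupsSketch {
    /// Append each input value to the vector of its group.
    /// `group_indices[i]` is the group that `values[i]` belongs to.
    fn update_batch(
        &mut self,
        values: &[f64],
        group_indices: &[usize],
        total_num_groups: usize,
    ) {
        assert_eq!(values.len(), group_indices.len());
        // Make room for groups seen for the first time; never shrink.
        if self.group_values.len() < total_num_groups {
            self.group_values.resize(total_num_groups, Vec::new());
        }
        for (&value, &group) in values.iter().zip(group_indices) {
            self.group_values[group].push(value);
        }
    }

    /// Consume the stored values and return one median per group.
    /// A group that never received a value yields `None`.
    fn evaluate(&mut self) -> Vec<Option<f64>> {
        std::mem::take(&mut self.group_values)
            .into_iter()
            .map(median)
            .collect()
    }
}

/// Exact median of one group's values; this is why all of them must be kept.
fn median(mut values: Vec<f64>) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    values.sort_by(|a, b| a.total_cmp(b));
    let mid = values.len() / 2;
    if values.len() % 2 == 0 {
        Some((values[mid - 1] + values[mid]) / 2.0)
    } else {
        Some(values[mid])
    }
}
```

The point of the sketch is the memory cost: every input value stays in
memory until `evaluate` runs, so the footprint grows with the number of rows
and not only with the number of groups.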
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]