I just found that MutableAggregationBuffer.update converts the data on every
update, which is terrible when the buffer holds something like a Map or Array.
That conversion semantic makes it hard to implement a collect_set UDAF
without it degrading to O(n^2).
Any advice?
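To make the cost concrete, here is a plain-Scala sketch (no Spark dependency; the names are illustrative) of why round-tripping the whole buffer on each update() is quadratic: appending the n-th element first reads back and copies all n-1 existing elements.

```scala
// Plain-Scala sketch (illustrative, not Spark's actual buffer code) of the
// cost of re-materializing an aggregation buffer on every update: each
// update converts the whole collection out and back, touching every
// existing element, so n updates copy O(n^2) elements in total.
object BufferCopyCost {
  def copyPerUpdate(n: Int): Long = {
    var buffer: Vector[Int] = Vector.empty
    var copied = 0L
    for (i <- 0 until n) {
      val converted = buffer.toList      // read side: convert whole buffer out
      copied += converted.length         // every existing element is touched
      buffer = (converted :+ i).toVector // write side: convert back in
    }
    copied
  }

  def main(args: Array[String]): Unit = {
    // Total copies grow as n*(n-1)/2, i.e. O(n^2).
    println(copyPerUpdate(10))  // 45
    println(copyPerUpdate(100)) // 4950
  }
}
```

A mutable accumulator that is only materialized once in evaluate() would avoid this, but the MutableAggregationBuffer interface forces the round trip on each row.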
--
Got it.
Only NamedExpressions have an exprId, so we have to make a new Attribute here.
private[this] val computedSchema = computedAggregates.map(_.resultAttribute)
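To illustrate the point about exprIds, here is a minimal plain-Scala sketch (these are illustrative stand-ins, not Spark's actual Catalyst classes): a computed aggregate expression carries no exprId of its own, so a fresh Attribute with a newly allocated id has to be minted for the parent plan to refer to.

```scala
import java.util.concurrent.atomic.AtomicLong

// Illustrative sketch, not Spark's real classes: only "named" expressions
// carry an exprId that downstream operators can bind against, so exposing a
// computed aggregate means minting a fresh attribute with a new unique id.
object ExprIdSketch {
  private val curId = new AtomicLong(0L)
  def newExprId(): Long = curId.getAndIncrement()

  final case class AttributeReference(name: String, exprId: Long)

  // Analogue of "computedAggregates.map(_.resultAttribute)": each computed
  // aggregate gets its own freshly numbered attribute.
  def resultAttribute(name: String): AttributeReference =
    AttributeReference(name, newExprId())
}
```

Two attributes with the same name still get distinct exprIds, which is what lets later operators resolve references unambiguously.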
--
View this message in context:
http://apache-spark-developers-list.1001551.n3.nabble.com/SparkSQL-Why-this-AttributeReference-exprId-i
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/Aggregate.scala#L85
I can't understand this code; it looks like a "bug", but GROUP BY in
SparkSQL works fine.
With the code below, some expressions are mapped to AttributeReferences, then
"bindRe
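For what it's worth, here is how I read the binding step, as a plain-Scala sketch (Attr, Bound, bind, and eval are illustrative names, not Spark's actual Catalyst classes; I'm assuming the truncated word refers to reference binding): each attribute in an expression tree gets replaced by a positional reference, looked up by exprId in the input attribute order, so evaluation becomes a simple ordinal lookup into the row.

```scala
// Illustrative sketch of reference binding, not Spark's real implementation:
// replace every attribute in an expression tree with a positional reference
// found by matching its exprId against the input schema, then evaluate by
// ordinal lookup into the row.
object BindSketch {
  sealed trait Expr
  final case class Attr(name: String, exprId: Int) extends Expr
  final case class Bound(ordinal: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // Rewrite Attr nodes into Bound nodes by position in the input attributes.
  def bind(e: Expr, input: Seq[Attr]): Expr = e match {
    case a: Attr   => Bound(input.indexWhere(_.exprId == a.exprId))
    case Add(l, r) => Add(bind(l, input), bind(r, input))
    case other     => other
  }

  // Evaluate a bound expression against a row of values.
  def eval(e: Expr, row: Seq[Int]): Int = e match {
    case Bound(i)  => row(i)
    case Add(l, r) => eval(l, row) + eval(r, row)
    case _: Attr   => sys.error("unbound attribute")
  }
}
```

Under this reading, the Aggregate operator would first map its computed expressions to attributes, and only then bind child expressions against that schema, which is why it works even though it looks odd at first.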