[SparkSQL][UDAF] CatalystTypeConverters for each update?

2016-07-19 Thread EarthsonLu
I just found that MutableAggregationBuffer.update converts the data on every update, which is terrible when I use something like a Map or Array. It is hard to implement a collect_set UDAF, which becomes O(n^2) under this conversion semantics. Any advice?
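The quadratic cost described above can be illustrated with a self-contained sketch (a toy model, not Spark's actual CatalystTypeConverters or MutableAggregationBuffer API): if every buffer update re-converts the whole accumulated collection, inserting n elements touches 1 + 2 + ... + n elements in total.

```scala
object ConvertCostDemo {
  // Counts how many elements have passed through the "converter".
  var converted = 0L

  // Stand-in for a Catalyst type conversion: walks the whole value.
  def convert(s: Set[Int]): Set[Int] = { converted += s.size; s }

  // Stand-in for MutableAggregationBuffer.update with a collection-typed
  // buffer column: the entire buffer is converted on every write.
  def update(buffer: Set[Int], elem: Int): Set[Int] = convert(buffer + elem)

  def main(args: Array[String]): Unit = {
    var buf = Set.empty[Int]
    for (i <- 1 to 1000) buf = update(buf, i)
    // Total converted elements is 1 + 2 + ... + 1000 = n(n+1)/2,
    // i.e. quadratic in the number of rows aggregated.
    println(converted) // 500500
  }
}
```

This is why a collect_set-style UDAF built on MutableAggregationBuffer pays O(n^2) work: the per-update conversion is linear in the current buffer size, not constant.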

Re: [SparkSQL][Solved] Why is this AttributeReference.exprId not set?

2014-11-24 Thread EarthsonLu
Got it. Only NamedExpressions have an exprId, so we have to create a new Attribute here: private[this] val computedSchema = computedAggregates.map(_.resultAttribute)
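The point above can be sketched with a toy model (these are illustrative stand-ins, not Spark's actual Catalyst classes): only named expressions carry an ExprId, so an aggregate's output has to be exposed through a fresh AttributeReference, which is what mapping over `_.resultAttribute` produces.

```scala
import java.util.concurrent.atomic.AtomicLong

object ExprIdDemo {
  private val curId = new AtomicLong(0)
  case class ExprId(id: Long)
  def newExprId(): ExprId = ExprId(curId.getAndIncrement())

  sealed trait Expression
  // An unnamed expression: it has no exprId of its own.
  case class Sum(child: String) extends Expression
  // A named expression: carries a stable ExprId that downstream
  // operators can bind against.
  case class AttributeReference(name: String, exprId: ExprId = newExprId())
      extends Expression

  // Mirrors the idea of computedAggregates.map(_.resultAttribute):
  // wrap each computed aggregate in a new attribute so its output
  // is addressable by exprId.
  def resultAttribute(agg: Sum): AttributeReference =
    AttributeReference(s"sum(${agg.child})")
}
```

Each call mints a distinct exprId, which is exactly why the new Attribute must be created once and reused, rather than regenerated wherever it is referenced.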

[SparkSQL] Why is this AttributeReference.exprId not set?

2014-11-24 Thread EarthsonLu
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/Aggregate.scala#L85 I can't understand this code; it seems to be a "bug", but GROUP BY in SparkSQL just works fine. With the code below, some expressions are mapped to AttributeReferences, then "bindRe…
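For context on the binding step mentioned above, here is a minimal sketch of what reference binding does conceptually (a toy model, not Spark's BindReferences implementation): each AttributeReference in an expression tree is replaced by its ordinal position in the child operator's output, matched on exprId.

```scala
object BindDemo {
  case class ExprId(id: Long)

  sealed trait Expr
  case class AttrRef(name: String, exprId: ExprId) extends Expr
  // A reference resolved to a position in the input row.
  case class BoundRef(ordinal: Int) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr

  // Replace every AttrRef with a BoundRef by looking up its exprId
  // in the input schema; any unmatched reference is an error.
  def bindReference(expr: Expr, input: Seq[AttrRef]): Expr = expr match {
    case a: AttrRef =>
      val ordinal = input.indexWhere(_.exprId == a.exprId)
      require(ordinal >= 0, s"couldn't find ${a.name} in $input")
      BoundRef(ordinal)
    case Add(l, r) => Add(bindReference(l, input), bindReference(r, input))
  }
}
```

Under this model, an attribute whose exprId was never set (or never registered in the child's output) would fail to bind, which is why the question of where the exprId comes from matters.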