Hi, Issue #1: I'm using the new UDAF interface (UserDefinedAggregateFunction) in the Spark 1.5.0 release. Is it possible to aggregate all values in the MutableAggregationBuffer into an array in a robust manner? I'm writing an aggregation function that collects values into an array from all input rows and then computes the final result from that array/list. The problem I'm running into is that the values held in the MutableAggregationBuffer are immutable, so I have to create a copy of the array every time I append a new value. That, of course, makes it very slow for any significant number of elements.
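For reference, here is a minimal sketch of the kind of UDAF I mean, collecting DoubleType inputs into an array (the schema and class name are just for illustration). Note how update() has to rebuild the stored sequence on every row, which is the copy-per-append cost described above:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.{MutableAggregationBuffer, UserDefinedAggregateFunction}
import org.apache.spark.sql.types._

// Illustrative only: collects all input doubles into an array.
class CollectToArray extends UserDefinedAggregateFunction {
  def inputSchema: StructType  = StructType(StructField("value", DoubleType) :: Nil)
  def bufferSchema: StructType = StructType(StructField("values", ArrayType(DoubleType)) :: Nil)
  def dataType: DataType       = ArrayType(DoubleType)
  def deterministic: Boolean   = true

  def initialize(buffer: MutableAggregationBuffer): Unit =
    buffer(0) = Seq.empty[Double]

  def update(buffer: MutableAggregationBuffer, input: Row): Unit =
    // The buffer value is immutable, so appending means copying the
    // whole sequence on every row -- O(n^2) overall for n inputs.
    buffer(0) = buffer.getSeq[Double](0) :+ input.getDouble(0)

  def merge(buffer1: MutableAggregationBuffer, buffer2: Row): Unit =
    buffer1(0) = buffer1.getSeq[Double](0) ++ buffer2.getSeq[Double](0)

  def evaluate(buffer: Row): Any = buffer.getSeq[Double](0)
}
```

Is there a supported way to keep a mutable collection in the buffer instead, or is copying unavoidable with this interface?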
Issue #2: I also tried the Hive 'collect_list' UDAF, but because the input values are UDTs, I get a scala.MatchError as a result. I suppose the Hive UDAFs only work with primitive parameters!? -JP -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/UDAF-and-UDT-with-SparkSQL-1-5-0-tp24670.html Sent from the Apache Spark User List mailing list archive at Nabble.com.