Hi,

As far as I know, Spark SQL doesn't provide native support for this feature right now. After searching, I found that only a few database systems support it, e.g., PostgreSQL.
Actually, based on Spark SQL's aggregation framework, I think it would not be very difficult to add support for this feature. The question is how frequently Spark SQL users need it and whether it is worth adding, because as far as I can see, this feature is not very common.

A possible alternative in current Spark SQL is to use an Aggregator with the Dataset API. You can write a custom Aggregator that uses a user-defined JVM object as the buffer to hold the input to your aggregate function. You may need to write the necessary Encoder for that buffer object.

If you really need this feature, you may open a JIRA to ask for others' opinions on it.

-----
Liang-Chi Hsieh | @viirya
Spark Technology Center
http://www.spark.tc/
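To make the Aggregator idea concrete, here is a minimal sketch. It assumes a Spark 2.x Dataset of a hypothetical Event case class (the field names, the "emit values in timestamp order" logic, and the SortedValues name are all illustrative assumptions, not something from this thread). The buffer is a plain JVM ArrayBuffer, so Encoders.kryo is used for it, as the message above suggests:

```scala
// Sketch only: assumes Spark 2.x on the classpath; Event and the
// sort-by-timestamp logic are hypothetical examples.
import org.apache.spark.sql.expressions.Aggregator
import org.apache.spark.sql.{Encoder, Encoders}
import scala.collection.mutable.ArrayBuffer

case class Event(key: String, ts: Long, value: Double)

// Buffers all (ts, value) pairs in a JVM object during reduce/merge,
// then sorts once at finish and emits the values in timestamp order.
object SortedValues
    extends Aggregator[Event, ArrayBuffer[(Long, Double)], Seq[Double]] {

  def zero: ArrayBuffer[(Long, Double)] = ArrayBuffer.empty

  def reduce(buf: ArrayBuffer[(Long, Double)],
             e: Event): ArrayBuffer[(Long, Double)] = {
    buf += ((e.ts, e.value))
    buf
  }

  def merge(a: ArrayBuffer[(Long, Double)],
            b: ArrayBuffer[(Long, Double)]): ArrayBuffer[(Long, Double)] =
    a ++= b

  def finish(buf: ArrayBuffer[(Long, Double)]): Seq[Double] =
    buf.sortBy(_._1).map(_._2).toSeq

  // Kryo handles arbitrary JVM buffer objects; this is the "necessary
  // encoder for the buffer object" mentioned above.
  def bufferEncoder: Encoder[ArrayBuffer[(Long, Double)]] =
    Encoders.kryo[ArrayBuffer[(Long, Double)]]

  def outputEncoder: Encoder[Seq[Double]] =
    Encoders.kryo[Seq[Double]]
}
```

You would apply it per key with something like `ds.groupByKey(_.key).agg(SortedValues.toColumn)`. Note that this buffers each group entirely in memory before sorting, which is exactly the trade-off of not having native sorted-aggregation support.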