When you call .groupBy, you get back a GroupedData object, and you need to apply an aggregate to it immediately afterwards to get back a DataFrame.
For example:

    val df1 = df.groupBy("colA").agg(sum(df("colB")))
    df1.show()

More information and examples can be found in the documentation below.

http://spark.apache.org/docs/1.6.2/api/scala/index.html#org.apache.spark.sql.DataFrame

Thanks,
Kevin

On Fri, Sep 30, 2016 at 5:46 AM, AJT <at...@currenex.com> wrote:

> I'm looking to do the following with my Spark dataframe
> (1) val df1 = df.groupBy(<long timestamp column>)
> (2) val df2 = df1.sort(<long timestamp column>)
> (3) val df3 = df2.mapPartitions(<set of aggregating functions>)
>
> I can already groupBy the column (in this case a long timestamp) - but have
> no idea how to ensure the returned GroupedData is then sorted by the
> same timestamp and then mapped to my set of functions
>
> Appreciate any help
> Thanks
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Dataframe-Grouping-Sorting-Mapping-tp27821.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
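For the group-then-sort-then-map pattern in the original question, one approach (a sketch only, not a drop-in answer) is to repartition by the grouping column and sort within partitions before calling mapPartitions. Both repartition-by-expression and sortWithinPartitions were added to DataFrame in Spark 1.6. The column names "ts" and "value" and the per-partition logic below are hypothetical placeholders:

```scala
// Sketch, assuming a DataFrame `df` with a long "ts" column and a numeric
// "value" column; both names are hypothetical stand-ins for the real schema.
import org.apache.spark.sql.functions._

// Hash-partition so all rows sharing a timestamp land in the same partition,
// then sort the rows inside each partition by that timestamp.
val prepared = df
  .repartition(df("ts"))
  .sortWithinPartitions("ts")

// Each partition now yields its rows in timestamp order, with whole groups
// kept together, so a custom aggregation can walk the sorted iterator.
// The body here is a placeholder for the real set of aggregating functions.
val result = prepared.mapPartitions { rows =>
  rows.map(r => (r.getLong(0), r.get(1)))
}
```

Note that mapPartitions on a DataFrame in 1.6 returns an RDD, so the result would need to be converted back (e.g. via toDF) if a DataFrame is required downstream.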