When you call .groupBy, you get back a GroupedData object, and you need to apply an aggregate to it immediately afterwards to get back a DataFrame.
For example:

    val df1 = df.groupBy("colA").agg(sum(df("colB")))
    df1.show()

More information and examples can be found in the documentation below.

http://spark.apache.org/docs/1.6.2/api/scala/index.html#org.apache.spark.sql.DataFrame

Thanks,
Kevin

On Fri, Sep 30, 2016 at 5:46 AM, AJT <at...@currenex.com> wrote:

> I'm looking to do the following with my Spark dataframe
> (1) val df1 = df.groupBy(<long timestamp column>)
> (2) val df2 = df1.sort(<long timestamp column>)
> (3) val df3 = df2.mapPartitions(<set of aggregating functions>)
>
> I can already groupBy the column (in this case a long timestamp) - but have
> no idea how to ensure the returned GroupedData is then sorted by the
> same timestamp and then mapped to my set of functions
>
> Appreciate any help
> Thanks
>
> --
> View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Dataframe-Grouping-Sorting-Mapping-tp27821.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
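For the group-then-sort-then-map pattern in the original question, one approach (a sketch only, not a drop-in answer) is to repartition by the grouping column and sort within partitions before calling mapPartitions. Both repartition-by-expression and sortWithinPartitions were added to DataFrame in Spark 1.6. The column names "ts" and "value" and the per-partition logic below are hypothetical placeholders:

```scala
// Sketch, assuming a DataFrame `df` with a long "ts" column and a numeric
// "value" column; both names are hypothetical stand-ins for the real schema.
import org.apache.spark.sql.functions._

// Hash-partition so all rows sharing a timestamp land in the same partition,
// then sort the rows inside each partition by that timestamp.
val prepared = df
  .repartition(df("ts"))
  .sortWithinPartitions("ts")

// Each partition now yields its rows in timestamp order, with whole groups
// kept together, so a custom aggregation can walk the sorted iterator.
// The body here is a placeholder for the real set of aggregating functions.
val result = prepared.mapPartitions { rows =>
  rows.map(r => (r.getLong(0), r.get(1)))
}
```

Note that mapPartitions on a DataFrame in 1.6 returns an RDD, so the result would need to be converted back (e.g. via toDF) if a DataFrame is required downstream.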