Spark SQL query

2016-10-06 Thread AJT
From what I have read on Spark SQL, you need to already have a dataframe which you can then query, e.g. select * from myDataframe where ..., where the dataframe is either a Hive table, an Avro file, etc. What if you want to create a dataframe from your underlying data on the fly with input paramet
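
One possible way to approach the question above, as a minimal sketch rather than a confirmed answer from the thread: build the dataframe inside a function that takes the input parameters, register it as a temporary view, and then query it with SQL. The names Event, loadEvents, source and minTs are hypothetical and only stand in for the poster's underlying data. The sketch assumes Spark 2.x, where createOrReplaceTempView is available (Spark 1.x used registerTempTable instead).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object OnTheFlyViewSketch {
  // Hypothetical record type standing in for the underlying data.
  case class Event(ts: Long, source: String, value: Double)

  // Hypothetical loader: builds a DataFrame from input parameters at call time.
  // A real loader would read from the actual data source instead of a Seq.
  def loadEvents(spark: SparkSession, source: String, minTs: Long): DataFrame = {
    import spark.implicits._
    val raw = Seq(Event(1L, "a", 1.0), Event(2L, "b", 2.0), Event(3L, "a", 3.0))
    raw.filter(e => e.source == source && e.ts >= minTs).toDF()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("on-the-fly-view-sketch")
      .master("local[*]")
      .getOrCreate()

    // Create the DataFrame on the fly, then expose it to Spark SQL by name.
    val df = loadEvents(spark, source = "a", minTs = 2L)
    df.createOrReplaceTempView("myDataframe")

    spark.sql("select * from myDataframe where value > 0").show()

    spark.stop()
  }
}
```

The temporary view only lives for the duration of the SparkSession, so the function can be called again with different parameters and the view replaced.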

Dataframe Grouping - Sorting - Mapping

2016-09-30 Thread AJT
I'm looking to do the following with my Spark dataframe:
(1) val df1 = df.groupBy()
(2) val df2 = df1.sort()
(3) val df3 = df2.mapPartitions()
I can already groupBy the column (in this case a long timestamp), but have no idea how to ensure the returned GroupedData is then sorted by the same t
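
The three steps in the question above can be sketched as follows. This is not an answer taken from the thread. It swaps groupBy for repartition plus sortWithinPartitions, because the object returned by groupBy only supports aggregations and cannot be sorted directly. The column names ts and value and the sample rows are hypothetical. The sketch assumes Spark 2.x with the Dataset API.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object SortedPartitionsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("sorted-partitions-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data: (timestamp, value).
    val df = Seq((3L, 1.0), (1L, 2.0), (2L, 3.0), (1L, 4.0)).toDF("ts", "value")

    // Rows with the same timestamp land in the same partition,
    // and each partition is ordered by timestamp.
    val sorted = df.repartition(col("ts")).sortWithinPartitions(col("ts"))

    // Process each partition in timestamp order. Here we only report the
    // first timestamp, last timestamp and row count of each non-empty partition.
    val summary = sorted.as[(Long, Double)].mapPartitions { rows =>
      val timestamps = rows.map(_._1).toList
      if (timestamps.isEmpty) Iterator.empty
      else Iterator.single((timestamps.head, timestamps.last, timestamps.size))
    }

    summary.show()

    spark.stop()
  }
}
```

A partition may hold more than one timestamp, so code inside mapPartitions that needs per-timestamp groups would still have to detect where the timestamp changes while iterating.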