From what I have read on Spark SQL, you need to already have a DataFrame which you can then query, e.g. select * from myDataframe where ..., where the DataFrame is backed by a Hive table, an Avro file, etc.

What if you want to create a DataFrame from your underlying data on the fly, with input parameters?
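Something like the sketch below is what I have in mind. This is only a rough illustration, assuming a Spark 2.x SparkSession with the spark-avro package on the classpath; the path, view name, ts column and cutoff value are made up for the example:

import org.apache.spark.sql.{DataFrame, SparkSession}

val spark = SparkSession.builder()
  .appName("OnTheFlyDataFrame")
  .master("local[*]")
  .getOrCreate()

// Build the DataFrame from the underlying data at call time,
// driven by input parameters instead of a pre-registered table.
def queryFor(inputPath: String, minTs: Long): DataFrame = {
  val df = spark.read.format("avro").load(inputPath)   // requires spark-avro
  df.createOrReplaceTempView("myDataframe")
  spark.sql(s"SELECT * FROM myDataframe WHERE ts >= $minTs")
}

// e.g. queryFor("/data/events/2015-06-01.avro", 1433116800000L)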
I'm looking to do the following with my Spark DataFrame:
(1) val df1 = df.groupBy()
(2) val df2 = df1.sort()
(3) val df3 = df2.mapPartitions()
I can already groupBy the column (in this case a long timestamp), but have no idea how to ensure the returned GroupedData is then sorted by that same timestamp.
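To make the question concrete, here is a minimal sketch of where I get stuck, using made-up data with a long ts column and a value column (again assuming a Spark 2.x SparkSession):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("GroupSortSketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val df = Seq((1433116800000L, "a"), (1433116802000L, "b"), (1433116801000L, "c"))
  .toDF("ts", "value")

// (1) works: grouping by the timestamp column.
val grouped = df.groupBy($"ts")
// (2) is the problem: the grouped result has no sort(), so I cannot
//     guarantee per-group ordering before handing it to mapPartitions.

// What I can do today is sort the whole DataFrame and run mapPartitions
// over it, but that does not keep each timestamp group in one partition:
val mapped = df.sort($"ts").rdd.mapPartitions { iter =>
  iter.map(row => (row.getLong(0), row.getString(1)))
}
mapped.collect().foreach(println)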