I have a use case similar to this: http://stackoverflow.com/questions/33878370/spark-dataframe-select-the-first-row-of-each-group
and I'm trying to understand the solution titled "ordering over structs":

1) Is a struct in Spark like a struct in C++?
2) What is an alias in this context?
3) How does this code even work?
4) Is it faster doing it this way than doing a join or window function in Spark SQL?

    import org.apache.spark.sql.functions.{max, struct}
    // Assumes df is an existing DataFrame and the implicits that provide $"..." are in scope.

    val dfTop = df.select($"Hour", struct($"TotalValue", $"Category").alias("vs"))
      .groupBy($"Hour")
      .agg(max("vs").alias("vs"))
      .select($"Hour", $"vs.Category", $"vs.TotalValue")

Thank you,
Imran