I'm trying to switch from the RDD API to the Dataset API. My question is about the reduceByKey method.
For example, I'm trying to rewrite

    sc.parallelize(Seq(1->2, 1->5, 3->6)).reduceByKey(math.max).take(10)

using the Dataset API. This is what I have so far:

    Seq(1->2, 1->5, 3->6).toDS.groupBy(_._1).agg(max($"_2").as(ExpressionEncoder[Int])).take(10)

Questions:

1. Is it possible to avoid typing "as(ExpressionEncoder[Int])", or to replace it with something shorter?
2. Why do I have to use a String column name in the max function, e.g. $"_2" or col("_2")? Can I use _._2 instead?

Alex
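
For reference, here is a minimal sketch of how the reduceByKey call above might be expressed with the typed Dataset API. It assumes Spark 2.1 or later, where the function-based groupBy is named groupByKey and the grouped result offers mapValues and reduceGroups; the object name, app name, and local master are placeholders chosen for illustration, and none of this is from the original post. The first variant stays fully typed and uses _._2 directly; the second keeps the column-based max but shortens the encoder to .as[Int], which relies on the implicits imported from the session.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.max

// Hypothetical wrapper object; only the two Dataset expressions matter.
object ReduceByKeySketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just to make the sketch self-contained.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("reduceByKey-sketch")
      .getOrCreate()
    import spark.implicits._ // brings in toDS, $"..." and the implicit encoders

    val ds = Seq(1 -> 2, 1 -> 5, 3 -> 6).toDS()

    // Variant 1: fully typed. Group by the key, keep only the value,
    // then reduce the values of each group with a binary function.
    val typedMax = ds
      .groupByKey(_._1)
      .mapValues(_._2)
      .reduceGroups((a: Int, b: Int) => math.max(a, b))

    // Variant 2: column-based aggregate, with .as[Int] resolving the
    // encoder implicitly instead of spelling out ExpressionEncoder[Int].
    val columnMax = ds
      .groupByKey(_._1)
      .agg(max($"_2").as[Int])

    // Both are Dataset[(Int, Int)] of (key, max value); row order is not guaranteed.
    typedMax.take(10).foreach(println)
    columnMax.take(10).foreach(println)

    spark.stop()
  }
}
```

The typed variant is closer in spirit to reduceByKey, since it takes an ordinary Scala function; the column-based variant expresses the aggregation as a SQL expression, which is why it takes a column rather than a lambda.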