Sure, just add ".toList.sorted" in there. Putting it together in one big
expression:
val rdd = sc.parallelize(List(List(1,2,3,4,5), List(6,7,8,9,10)))
val result = rdd
  .flatMap(_.zipWithIndex)
  .groupBy(_._2)
  .values
  .map(_.map(_._1).toList.sorted)
List(2, 7)
List(1, 6)
List(4, 9)
List(3, 8)
List(5, 10)
ArrayBuffer(3, 13, 18, 8)
ArrayBuffer(9, 19, 4, 14)
ArrayBuffer(15, 20, 10, 5)
The values are grouped correctly, but the order of the groups is shuffled.
Can I maintain the order?
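The shuffle happens because `groupBy` returns groups in hash/partition order, not index order. A hedged sketch of one fix (not from this thread's replies): key each value by its index and sort on that key before taking the values. On the RDD this would be `rdd.flatMap(_.zipWithIndex).map(_.swap).groupByKey().sortByKey().values.map(_.toList.sorted)`; the same steps on plain Scala collections, for illustration:

```scala
object TransposeDemo {
  // Transpose rows into columns, keeping columns in index order.
  // Mirrors the RDD pipeline: zipWithIndex -> group by index -> sort by index.
  def transpose(rows: List[List[Int]]): List[List[Int]] =
    rows.flatMap(_.zipWithIndex)   // (value, columnIndex) pairs
        .groupBy(_._2)             // group by column index
        .toList
        .sortBy(_._1)              // restore column order
        .map(_._2.map(_._1).sorted)

  def main(args: Array[String]): Unit = {
    println(transpose(List(List(1,2,3,4,5), List(6,7,8,9,10))))
    // List(List(1, 6), List(2, 7), List(3, 8), List(4, 9), List(5, 10))
  }
}
```

On the RDD, `sortByKey()` (available on pair RDDs) does this ordering step in a distributed way, so nothing is collected to the driver.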
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Transform-RDD-List-tp11948p11974.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
> this without using the collect method, because a real-world
> RDD can have a lot of elements, which may cause an out-of-memory error.
> Any ideas will be welcome.
>
> Best regards
> Kevin
>
>
>
This problem is related to a pivot table.
Thanks
Kevin
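Since this is pivot-table-like, the same pattern generalizes without `collect`: key each record by its pivot columns, aggregate per key, and sort the keys; grouping, aggregation, and sorting are all distributed operations, so nothing is pulled to the driver. A sketch with hypothetical (day, item, amount) records, using plain Scala collections to mirror what `reduceByKey` plus `sortByKey` would do on an RDD:

```scala
object PivotDemo {
  // Pivot (row, col, value) triples into a total per (row, col) cell.
  // On an RDD this would be:
  //   records.map { case (r, c, v) => ((r, c), v) }
  //          .reduceByKey(_ + _)
  //          .sortByKey()
  def pivot(records: List[(String, String, Int)]): Map[(String, String), Int] =
    records.groupBy { case (r, c, _) => (r, c) }  // key by pivot cell
           .map { case (key, vs) => key -> vs.map(_._3).sum }  // aggregate

  def main(args: Array[String]): Unit = {
    val records = List(("mon", "a", 1), ("mon", "a", 2), ("tue", "b", 3))
    println(pivot(records).toList.sorted.mkString("\n"))
  }
}
```

The record shape and field names here are assumptions for illustration; the point is that per-key aggregation plus a key sort reproduces the pivot layout while staying distributed.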