Re: Transform RDD[List]

2014-08-12 Thread Sean Owen
Sure, just add ".toList.sorted" in there. Putting it together in one big expression:

val rdd = sc.parallelize(List(List(1,2,3,4,5), List(6,7,8,9,10)))
val result = rdd.flatMap(_.zipWithIndex).groupBy(_._2).values.map(_.map(_._1).toList.sorted)

List(2, 7)
List(1, 6)
List(4, 9)
List(3, 8)
List(5, 10)
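For completeness, here is a self-contained version of the same expression that can be run outside the spark-shell; the application name, the local master setting, and the final collect-and-print are illustrative additions, not part of the original reply:

import org.apache.spark.{SparkConf, SparkContext}

object TransposeLists {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("TransposeLists").setMaster("local[*]"))

    val rdd = sc.parallelize(List(List(1, 2, 3, 4, 5), List(6, 7, 8, 9, 10)))
    val result = rdd
      .flatMap(_.zipWithIndex)            // pair each value with its position in the list
      .groupBy(_._2)                      // one group per position
      .values
      .map(_.map(_._1).toList.sorted)     // sort the values within each group

    // The groups themselves may come back in any order, as in the output above.
    result.collect().foreach(println)
    sc.stop()
  }
}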

Re: Transform RDD[List]

2014-08-12 Thread Kevin Jung
ArrayBuffer(..., 13, 18, 8)
ArrayBuffer(9, 19, 4, 14)
ArrayBuffer(15, 20, 10, 5)

It collects well, but the order is shuffled. Can I maintain the order?
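The ".toList.sorted" suggestion in the reply above sorts the values inside each group, which restores the original order here because each column of the sample data happens to be increasing. One way to preserve the row order for arbitrary data, not taken from the thread but sketched here as an option (a SparkContext named sc is assumed, as in spark-shell), is to carry an explicit row index:

val rdd = sc.parallelize(List(List(1, 2, 3, 4, 5), List(6, 7, 8, 9, 10)))

val columns = rdd
  .zipWithIndex()                                // attach a row index to each list
  .flatMap { case (row, rowIdx) =>
    row.zipWithIndex.map { case (v, colIdx) => (colIdx, (rowIdx, v)) }
  }
  .groupByKey()                                  // gather each column's (rowIdx, value) pairs
  .mapValues(_.toList.sortBy(_._1).map(_._2))    // restore the original row order
  .sortByKey()                                   // optional: emit the columns in order as well
  .values

// columns is an RDD[List[Int]]: List(1, 6), List(2, 7), List(3, 8), List(4, 9), List(5, 10)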

Re: Transform RDD[List]

2014-08-12 Thread Sean Owen
> ...this without using the collect method, because a real-world RDD can have a
> lot of elements and it may cause out of memory.
> Any ideas will be welcome.
>
> Best regards
> Kevin
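The body of this reply is cut off in the excerpt. Judging from the follow-up message further up the thread ("just add ".toList.sorted" in there"), the expression originally suggested was presumably the same pipeline without the final sort, roughly (sc assumed, as in spark-shell):

val rdd = sc.parallelize(List(List(1, 2, 3, 4, 5), List(6, 7, 8, 9, 10)))
val result = rdd
  .flatMap(_.zipWithIndex)    // (value, positionInList) pairs
  .groupBy(_._2)              // group values that share a position
  .values
  .map(_.map(_._1))           // drop the position, keep only the values

// groupBy involves a shuffle, so the ordering inside each group is not guaranteed,
// which is what the follow-up question about shuffled order refers to.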

Re: Transform RDD[List]

2014-08-11 Thread Kevin Jung
This problem is related to a pivot table.

Thanks
Kevin
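The pivot-table connection is that each element gets re-keyed by the column it should end up in, and the values are then gathered per key. A toy sketch of that framing, not taken from the thread (the record layout and labels are hypothetical, and sc is assumed):

// Hypothetical (rowKey, colKey, value) records
val records = sc.parallelize(Seq(
  ("row1", "c1", 10), ("row1", "c2", 20),
  ("row2", "c1", 30), ("row2", "c2", 40)))

val pivoted = records
  .map { case (row, col, v) => (row, (col, v)) }    // key by the row label
  .groupByKey()                                     // gather the cells of each row
  .mapValues(_.toList.sortBy(_._1).map(_._2))       // order the cells by column label

// pivoted contains (row1, List(10, 20)) and (row2, List(30, 40))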

Re: Transform RDD[List]

2014-08-11 Thread Soumya Simanta

Transform RDD[List]

2014-08-11 Thread Kevin Jung
...this without using the collect method, because a real-world RDD can have a lot of elements and it may cause out of memory. Any ideas will be welcome.

Best regards
Kevin
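From the replies further up the thread, the transformation being asked about is effectively a transpose of the nested lists, built without bringing the whole RDD to the driver. The shape of the problem, using the sample data from Sean Owen's reply (sc assumed, as in spark-shell):

// Input
val rdd = sc.parallelize(List(List(1, 2, 3, 4, 5), List(6, 7, 8, 9, 10)))

// Desired output, computed distributed (no collect on the full RDD):
// List(1, 6), List(2, 7), List(3, 8), List(4, 9), List(5, 10)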