RE: Splitting RDD and Grouping together to perform computation

2014-03-28 Thread yh18190
Hi Adrian, Of course you can use sortByKey, but when you then run mapPartitions there is no guarantee that the first partition holds all of those elements in the order of the original sequence. I think we need a partitioner that partitions the sequence while maintaining order... Could anyone help me in defining

RE: Splitting RDD and Grouping together to perform computation

2014-03-28 Thread Adrian Mocanu
Splitting RDD and Grouping together to perform computation Hi Adrian, Thanks for the suggestion. Could you please modify the part of my code where I need to do so? I apologise for the inconvenience; because I am new to Spark, I couldn't apply it appropriately. I would be thankful to you. -- View this message

RE: Splitting RDD and Grouping together to perform computation

2014-03-28 Thread yh18190
Hi Adrian, Thanks for the suggestion. Could you please modify the part of my code where I need to do so? I apologise for the inconvenience; because I am new to Spark, I couldn't apply it appropriately. I would be thankful to you. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/S

Re: Splitting RDD and Grouping together to perform computation

2014-03-28 Thread Syed A. Hashmi
le you need to import > org.apache.spark.rdd.OrderedRDDFunctions > > -Original Message- > From: yh18190 [mailto:yh18...@gmail.com] > Sent: March-28-14 5:02 PM > To: u...@spark.incubator.apache.org > Subject: RE: Splitting RDD and Grouping together to perform computation > > > Hi, > Here

RE: Splitting RDD and Grouping together to perform computation

2014-03-28 Thread Adrian Mocanu
190 [mailto:yh18...@gmail.com] Sent: March-28-14 5:02 PM To: u...@spark.incubator.apache.org Subject: RE: Splitting RDD and Grouping together to perform computation Hi, Here is my code for the given scenario. Could you please let me know where to sort? I mean, on what basis do we have to sort, so that t

RE: Splitting RDD and Grouping together to perform computation

2014-03-28 Thread yh18190
Hi, Here is my code for the given scenario. Could you please let me know where to sort? I mean, on what basis do we have to sort, so that the elements keep the same order within a partition as in the original sequence? val res2 = reduced_hccg.map(_._2) // which gives an RDD of numbers res2.foreach(println) val result= res2.ma

RE: Splitting RDD and Grouping together to perform computation

2014-03-28 Thread Adrian Mocanu
I think you should sort each RDD. -Original Message- From: yh18190 [mailto:yh18...@gmail.com] Sent: March-28-14 4:44 PM To: u...@spark.incubator.apache.org Subject: Re: Splitting RDD and Grouping together to perform computation Hi, Thanks Nan Zhu. I tried to implement your suggestion on

Re: Splitting RDD and Grouping together to perform computation

2014-03-28 Thread yh18190
Hi, Thanks Nan Zhu. I tried to implement your suggestion on the following scenario. I have an RDD of, say, 24 elements. When I partitioned it into two groups of 12 elements each, the order of the elements within each partition was lost. Elements are partitioned randomly. I need to preserve the order such that the first 1
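
One way to think about the ordering problem described above is to attach each element's original position and derive the group from that position, rather than relying on how a partitioner happens to place elements. The following is a minimal plain-Scala sketch of that idea, not Spark code: `elements`, `groupSize` and `groups` are hypothetical names, and a local Vector stands in for the 24-element RDD. On an actual RDD the index could presumably come from `zipWithIndex` in Spark releases that provide it.

```scala
// Local stand-in for the 24-element dataset from the message above (assumption).
val elements: Vector[Int] = (1 to 24).toVector
val groupSize = 12

// Pair each element with its original position, key by position / groupSize,
// then sort inside each group by position so the original order is explicit.
val groups: Map[Int, Vector[Int]] =
  elements.zipWithIndex
    .groupBy { case (_, idx) => idx / groupSize }
    .map { case (g, pairs) => g -> pairs.sortBy(_._2).map(_._1) }

// groups(0) holds 1..12 and groups(1) holds 13..24, both in original order.
groups.toSeq.sortBy(_._1).foreach { case (g, xs) => println(s"group $g: ${xs.mkString(",")}") }
```

The explicit sort by index is redundant for a local Vector, but it is the step that would matter once the indexed pairs have been shuffled across partitions.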

Re: Splitting RDD and Grouping together to perform computation

2014-03-24 Thread Nan Zhu
I didn't group the integers, but processed them in groups of two. Partition the input: scala> val a = sc.parallelize(List(1, 2, 3, 4), 2) a: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:12 Then process each partition, handling the elements in the partition in groups of 2: sc
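
The preview above is cut off before the per-partition step, so here is a hedged sketch of what processing a partition "in groups of 2" could look like. It is not Nan Zhu's original code: `sumPairs` is a hypothetical helper, summing is just an example computation, and the two local sequences stand in for the partitions of `a`, assuming they hold (1, 2) and (3, 4).

```scala
// Hypothetical helper: consumes one partition's iterator two elements at a time
// and emits one result (here, the sum) per pair.
def sumPairs(iter: Iterator[Int]): Iterator[Int] =
  iter.grouped(2).map(_.sum)

// Local stand-ins for the two partitions of sc.parallelize(List(1, 2, 3, 4), 2).
val partitions: Seq[Seq[Int]] = Seq(Seq(1, 2), Seq(3, 4))

// Apply the helper to each partition independently, as mapPartitions would.
val pairSums: Seq[Int] = partitions.flatMap(p => sumPairs(p.iterator))

println(pairSums.mkString(","))  // 3,7
```

In Spark the same function would be passed along the lines of `a.mapPartitions(sumPairs)`. Note that pairs never span a partition boundary, so a partition with an odd number of elements ends with a group of one.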

Re: Splitting RDD and Grouping together to perform computation

2014-03-24 Thread Nan Zhu
Partition your input into even number partitions, then use mapPartitions to operate on the Iterator[Int]. Maybe there is a more efficient way... Best, -- Nan Zhu On Monday, March 24, 2014 at 7:59 PM, yh18190 wrote: > Hi, I have a large data set of numbers, i.e. an RDD, and wanted to perform a > comput

Re: Splitting RDD and Grouping together to perform computation

2014-03-24 Thread yh18190
We need someone who can explain this with a short code snippet on the given example, so that we get a clear idea of RDD indexing. Guys, please help us. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Splitting-RDD-and-Grouping-together-to-perform-computation-tp31

Re: Splitting RDD and Grouping together to perform computation

2014-03-24 Thread Walrus theCat
I'm also interested in this. On Mon, Mar 24, 2014 at 4:59 PM, yh18190 wrote: > Hi, I have a large data set of numbers, i.e. an RDD, and wanted to perform a > computation only on groups of two values at a time. For example, > 1,2,3,4,5,6,7... is an RDD. Can I group the RDD into (1,2),(3,4),(5,6)...? > and p