Hi Andriana,
Of course you can use sortByKey, but after that, when you perform mapPartitions, there is no guarantee that the first partition holds those elements in the order of the original sequence. I think we need a partitioner that splits the sequence while maintaining order. Could anyone help me define one?
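As a rough illustration of that idea, here is a minimal sketch (mine, not from this thread) of an order-preserving partitioner: elements are keyed by their position, and a custom Partitioner assigns contiguous index ranges to partitions, so partition 0 holds the first block of the sequence, partition 1 the next, and so on. The names data and numParts are placeholders.

import org.apache.spark.Partitioner

// Assigns contiguous index ranges to partitions, so the original
// sequence order is preserved across partitions 0, 1, 2, ...
class RangeIndexPartitioner(totalCount: Long, parts: Int) extends Partitioner {
  private val perPartition = math.max(1L, math.ceil(totalCount.toDouble / parts).toLong)
  override def numPartitions: Int = parts
  override def getPartition(key: Any): Int =
    math.min(parts - 1, (key.asInstanceOf[Long] / perPartition).toInt)
}

// key each element by its position, repartition by index range,
// and sort by the index inside every partition
val indexed = data.zipWithIndex().map { case (v, i) => (i, v) }
val ordered = indexed.repartitionAndSortWithinPartitions(
  new RangeIndexPartitioner(data.count(), numParts))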
Splitting RDD and Grouping together to perform computation
Hi Andriana,
Thanks for the suggestion. Could you please show me where in my code I need to do this? I apologise for the inconvenience; I am new to Spark and couldn't apply it correctly. I would be thankful to you.
For sortByKey to be available you need to import
org.apache.spark.rdd.OrderedRDDFunctions
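As a quick illustration of that advice (a sketch of my own, not code from the thread; nums is a placeholder RDD[Int]): key each value by its original position and sort by that key, which also lets you choose the number of partitions.

// older Spark versions needed the extra import above for sortByKey;
// recent versions bring the pair-RDD implicits into scope automatically
val byIndex = nums.zipWithIndex().map { case (v, i) => (i, v) }
val inOrder = byIndex.sortByKey(ascending = true, numPartitions = 2)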
From: yh18190 [mailto:yh18...@gmail.com]
Sent: March-28-14 5:02 PM
To: u...@spark.incubator.apache.org
Subject: RE: Splitting RDD and Grouping together to perform computation
Hi,
Here is my code for the given scenario. Could you please let me know where to sort, and on what basis, so that the elements in each partition keep the order of the original sequence?
val res2 = reduced_hccg.map(_._2) // which gives an RDD of numbers
res2.foreach(println)
val result= res2.ma
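The message is cut off above, so here is one possible way the snippet could continue (a sketch under my own assumptions, not the original code): sort right after extracting the values, using each element's original position as the key, and only then apply mapPartitions to take the elements two at a time. Only reduced_hccg and res2 come from the message above; everything else is assumed, including the per-pair operation (a sum, just as an example).

// attach the original position, sort on it into 2 partitions, then drop the key
val inOrder = res2.zipWithIndex()
  .map { case (v, i) => (i, v) }
  .sortByKey(numPartitions = 2)   // the sort goes here, before mapPartitions
  .map(_._2)

// walk every partition two elements at a time (assumes even-sized partitions)
val result = inOrder.mapPartitions(_.grouped(2).map(_.sum))
result.foreach(println)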
I think you should sort each RDD
-Original Message-
From: yh18190 [mailto:yh18...@gmail.com]
Sent: March-28-14 4:44 PM
To: u...@spark.incubator.apache.org
Subject: Re: Splitting RDD and Grouping together to perform computation
Hi,
Thanks Nan Zhu. I tried to implement your suggestion on the following scenario. I have an RDD of, say, 24 elements. When I partitioned it into two groups of 12 elements each, the order of the elements within the partitions was lost; the elements are partitioned randomly. I need to preserve the order, so that the first 12 elements of the sequence end up in the first partition.
I didn't group the integers; I processed them in groups of two. First, partition the data:
scala> val a = sc.parallelize(List(1, 2, 3, 4), 2)
a: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:12
then process each partition, handling its elements in groups of two.
partition your input into an even number of partitions, then
use mapPartitions to operate on each partition's Iterator[Int];
maybe there is a more efficient way…
Best,
--
Nan Zhu
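To make those steps concrete, here is a small sketch along the lines of that suggestion (my reconstruction, not part of the original message), continuing from the RDD a built above: mapPartitions walks each partition's iterator two elements at a time, here summing each pair as an example operation.

// `a` is the RDD built above with sc.parallelize(List(1, 2, 3, 4), 2)
val sums = a.mapPartitions(_.grouped(2).map(_.sum))   // one result per group of two
sums.collect()                                        // => Array(3, 7)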
We need someone who can explain this with a short code snippet on the given example, so that we get a clear idea of how to index RDDs.
Could anyone please help us?
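Since the question is about indexing the elements of an RDD, here is a brief sketch (mine, not from the thread) of two standard ways to attach indices; nums stands for any RDD[Int] and is a placeholder name.

// zipWithIndex assigns each element its global position in the RDD
val withIndex = nums.zipWithIndex()            // RDD[(Int, Long)]: (value, index)

// mapPartitionsWithIndex exposes the partition number instead
val withPartition = nums.mapPartitionsWithIndex { (pid, iter) =>
  iter.map(v => (pid, v))                      // (partition id, value)
}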
I'm also interested in this.
On Mon, Mar 24, 2014 at 4:59 PM, yh18190 wrote:
> Hi, I have a large data set of numbers (an RDD) and want to perform a
> computation only on groups of two values at a time. For example,
> 1,2,3,4,5,6,7... is an RDD. Can I group the RDD into (1,2),(3,4),(5,6)...
> and perform the computation on each pair?
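One way to get exactly the pairs (1,2),(3,4),(5,6)... regardless of how the data happens to be partitioned (a sketch of my own, not from the thread; nums is a placeholder RDD[Int]) is to key every element by index / 2, so that two consecutive elements share a key, and then group:

val pairs = nums.zipWithIndex()
  .map { case (v, i) => (i / 2, (i, v)) }              // consecutive elements share a key
  .groupByKey()
  .map { case (k, vs) => (k, vs.toList.sortBy(_._1).map(_._2)) }  // keep order inside each pair
  .sortByKey()
  .map(_._2)                                           // List(1, 2), List(3, 4), ...

This pays the cost of a shuffle; the mapPartitions approach discussed earlier avoids that when the partition boundaries already fall in the right places.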