Why are you converting to RDD and back to JavaRDD?
The problem is that you are storing references to Writable objects, which
the InputFormat's RecordReader reuses and mutates for every record. Somewhere
you end up with 1000 references to the same key object. I think the persist
may be what triggers it. You want to transform these values into something
other than a Writable immediately after reading them.
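
To illustrate the effect without Spark or Hadoop on the classpath, here is a
minimal plain-Java sketch. Everything in it is hypothetical and for
illustration only: MutableRecord stands in for a Writable, reader() stands in
for a record reader that hands back one reused instance, and Point is an
immutable copy. It is not the actual Hadoop or Spark implementation, only the
same reuse pattern in miniature.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class ReuseDemo {
    // Hypothetical stand-in for a Writable: a mutable holder that gets overwritten.
    static final class MutableRecord {
        int key;
        final double[] values = new double[1];
    }

    // Hypothetical immutable copy of one record.
    static final class Point {
        final int key;
        final double[] values;
        Point(int key, double[] values) {
            this.key = key;
            this.values = values;
        }
    }

    // Hypothetical stand-in for a record reader: it returns the SAME
    // MutableRecord instance on every call and only overwrites its fields.
    public static Iterator<MutableRecord> reader(final int n) {
        final MutableRecord shared = new MutableRecord();
        return new Iterator<MutableRecord>() {
            private int next = 0;
            @Override public boolean hasNext() { return next < n; }
            @Override public MutableRecord next() {
                if (next >= n) throw new NoSuchElementException();
                shared.key = next;
                shared.values[0] = next * 0.5;
                next++;
                return shared;
            }
        };
    }

    // Caches the references as they come: every cached entry is the same object.
    public static List<Integer> keysFromCachedReferences(int n) {
        List<MutableRecord> cache = new ArrayList<>();
        Iterator<MutableRecord> it = reader(n);
        while (it.hasNext()) cache.add(it.next());
        List<Integer> keys = new ArrayList<>();
        for (MutableRecord r : cache) keys.add(r.key);
        return keys;
    }

    // Copies each record into an immutable Point before caching it.
    public static List<Integer> keysFromCachedCopies(int n) {
        List<Point> cache = new ArrayList<>();
        Iterator<MutableRecord> it = reader(n);
        while (it.hasNext()) {
            MutableRecord r = it.next();
            cache.add(new Point(r.key, r.values.clone())); // clone: the array is reused too
        }
        List<Integer> keys = new ArrayList<>();
        for (Point p : cache) keys.add(p.key);
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(keysFromCachedReferences(3)); // prints [2, 2, 2]
        System.out.println(keysFromCachedCopies(3));     // prints [0, 1, 2]
    }
}
```

In your code the analogous step would presumably be a map right after
sequenceFile and before persist, one that pulls out key.get() and a copy of
array.getData(), so that what gets cached is plain values rather than the
reused Writable instances.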

On Mon, Jul 25, 2016, 18:50 Jia Zou <jacqueline...@gmail.com> wrote:

>
> My code is as follows:
>
>     System.out.println("Initialize points...");
>     JavaPairRDD<IntWritable, DoubleArrayWritable> data =
>             sc.sequenceFile(inputFile, IntWritable.class, DoubleArrayWritable.class);
>     RDD<Tuple2<IntWritable, DoubleArrayWritable>> rdd =
>             JavaPairRDD.toRDD(data);
>     JavaRDD<Tuple2<IntWritable, DoubleArrayWritable>> points =
>             JavaRDD.fromRDD(rdd, data.classTag());
>     points.persist(StorageLevel.MEMORY_ONLY());
>     int i;
>     for (i=0; i<iterations; i++) {
>         System.out.println("iteration="+i);
>         //points.foreach(new ForEachMapPointToCluster(numDimensions, numClusters));
>         points.foreach(new VoidFunction<Tuple2<IntWritable, DoubleArrayWritable>>() {
>             public void call(Tuple2<IntWritable, DoubleArrayWritable> tuple) {
>                 IntWritable key = tuple._1();
>                 System.out.println("key:"+key.get());
>                 DoubleArrayWritable array = tuple._2();
>                 double[] point = array.getData();
>                 for (int d = 0; d < 20; d ++) {
>                     System.out.println(d+":"+point[d]);
>                 }
>             }
>         });
>     }
>
>
> The output is many repetitions of the following; only the last element
> of the RDD is ever printed.
>
> key:999
> 0:0.9953839426689233
> 1:0.12656798341145892
> 2:0.16621114723289654
> 3:0.48628049787614236
> 4:0.476991470215116
> 5:0.5033640235789054
> 6:0.09257098597507829
> 7:0.3153088440494892
> 8:0.8807426085223242
> 9:0.2809625780570739
> 10:0.9584880094505738
> 11:0.38521222520661547
> 12:0.5114241334425228
> 13:0.9524628903835111
> 14:0.5252549496842003
> 15:0.5732037830866236
> 16:0.8632451606583632
> 17:0.39754347061499895
> 18:0.2859522809981715
> 19:0.2659002343432888
> key:999
> 0:0.9953839426689233
> 1:0.12656798341145892
> 2:0.16621114723289654
> 3:0.48628049787614236
> 4:0.476991470215116
> 5:0.5033640235789054
> 6:0.09257098597507829
> 7:0.3153088440494892
> 8:0.8807426085223242
> 9:0.2809625780570739
> 10:0.9584880094505738
> 11:0.38521222520661547
> 12:0.5114241334425228
> 13:0.9524628903835111
> 14:0.5252549496842003
> 15:0.5732037830866236
> 16:0.8632451606583632
> 17:0.39754347061499895
> 18:0.2859522809981715
> 19:0.2659002343432888
> key:999
> 0:0.9953839426689233
> 1:0.12656798341145892
> 2:0.16621114723289654
> 3:0.48628049787614236
> 4:0.476991470215116
> 5:0.5033640235789054
> 6:0.09257098597507829
> 7:0.3153088440494892
> 8:0.8807426085223242
> 9:0.2809625780570739
> 10:0.9584880094505738
> 11:0.38521222520661547
> 12:0.5114241334425228
> 13:0.9524628903835111
> 14:0.5252549496842003
> 15:0.5732037830866236
> 16:0.8632451606583632
> 17:0.39754347061499895
> 18:0.2859522809981715
> 19:0.2659002343432888
>
