Re: JavaRDD.foreach (new VoidFunction<>...) always returns the last element

2016-07-25 Thread Jia Zou
Hi Sean, Thanks for your great help! It works all right if I remove persist!! For next step, I will transform those values before persist. I convert to RDD and back to JavaRDD just for testing purposes. Best Regards, Jia On Mon, Jul 25, 2016 at 1:01 PM, Sean Owen wrote: > Why are you converti

Re: JavaRDD.foreach (new VoidFunction<>...) always returns the last element

2016-07-25 Thread Sean Owen
Why are you converting to RDD and back to JavaRDD? The problem is storing references to Writable, which are mutated by the InputFormat. Somewhere you have 1000 refs to the same key. I think it may be the persist. You want to immediately transform these values to something besides a Writable. On Mo
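
The reuse problem described above can be illustrated without Spark or Hadoop. What follows is a minimal sketch, not the poster's code: `MutableInt` is a hypothetical stand-in for a Writable such as `IntWritable`, and the loops imitate a record reader that overwrites one holder object for every record. Storing the holder itself leaves every cached entry pointing at the last value; copying the value out immediately keeps each record distinct.

```java
import java.util.ArrayList;
import java.util.List;

public class WritableReuseDemo {
    // Hypothetical stand-in for a mutable Hadoop Writable (e.g. IntWritable).
    static final class MutableInt {
        private int value;
        void set(int v) { value = v; }
        int get() { return value; }
    }

    // Imitates caching the reader's reused holder: every list entry is the
    // same object, so all entries show the last value written.
    static List<Integer> storeReferences(int n) {
        MutableInt reused = new MutableInt();
        List<MutableInt> cached = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            reused.set(i);       // the "reader" overwrites the same object
            cached.add(reused);  // only a reference is stored
        }
        List<Integer> seen = new ArrayList<>();
        for (MutableInt m : cached) {
            seen.add(m.get());
        }
        return seen;
    }

    // Imitates transforming each record to an immutable value before caching.
    static List<Integer> storeCopies(int n) {
        MutableInt reused = new MutableInt();
        List<Integer> cached = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            reused.set(i);
            cached.add(reused.get()); // copy the value out immediately
        }
        return cached;
    }

    public static void main(String[] args) {
        System.out.println(storeReferences(3)); // prints [2, 2, 2]
        System.out.println(storeCopies(3));     // prints [0, 1, 2]
    }
}
```

In a Spark job the analogous step would be a map over the sequence-file RDD that copies each key and value into plain Java objects before calling persist.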

JavaRDD.foreach (new VoidFunction<>...) always returns the last element

2016-07-25 Thread Jia Zou
My code is as follows: System.out.println("Initialize points..."); JavaPairRDD<IntWritable, DoubleArrayWritable> data = sc.sequenceFile(inputFile, IntWritable.class, DoubleArrayWritable.class); RDD<Tuple2<IntWritable, DoubleArrayWritable>> rdd = JavaPairR