Getting Parts of Iterables in Function's call method

2014-11-19 Thread jelgh
Hello, I run groupBy on a JavaRDD so that I get a JavaPairRDD>. If I then run, for instance, a reduceByKey, could I get a partition of the grouped Iterable in the reduce function's call method? Or will I always get a full group's Iterable? If you always get a full group's Iterable, you know you

ReduceByKey but with different functions depending on key

2014-11-18 Thread jelgh
Hello everyone, I'm new to Spark and I have the following problem: I have this large JavaRDD collection, which I group by creating a hashcode from some fields in MyClass: JavaRDD collection = ...; JavaPairRDD> grouped = collection.groupBy(...); // the group-function is just creating a hashc