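Map each (key, value) pair to (key, Tuple2<key, value>) and process that,
so the function you pass to reduceByKey can read the key from the values
it is given. It may also be worth asking the Spark maintainers for versions
of the keyed operations where the key is passed in as an argument. I run
into these cases all the time.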
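import java.io.Serializable;
import javax.annotation.Nonnull;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.function.PairFunction;
import scala.Tuple2;

/**
 * Map each (key, value) pair to (key, (key, value)) to ensure subsequent
 * processing has access to both key and value.
 * @param inp input pair RDD
 * @param <K> key type
 * @param <V> value type
 * @return output where the value holds both key and value
 */
@Nonnull
public static <K extends Serializable, V extends Serializable>
JavaPairRDD<K, Tuple2<K, V>> toKeyedTuples(@Nonnull JavaPairRDD<K, V> inp) {
    // Exactly one output pair per input pair, so mapToPair with a
    // PairFunction is used rather than flatMapToPair.
    return inp.mapToPair(new PairFunction<Tuple2<K, V>, K, Tuple2<K, V>>() {
        @Override
        public Tuple2<K, Tuple2<K, V>> call(final Tuple2<K, V> t) throws Exception {
            // Keep the key as the pair key and also embed it in the value.
            return new Tuple2<K, Tuple2<K, V>>(t._1(),
                    new Tuple2<K, V>(t._1(), t._2()));
        }
    });
}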
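
To show why carrying the key inside the value helps, here is a minimal sketch in plain Java collections rather than Spark, so it runs without a cluster. The reducer sees only two values, as a reduceByKey function would, and picks its behaviour from the key embedded in them. The class name KeyedReduceSketch, the sample keys, and the rule "keys starting with max take the maximum, everything else is summed" are all assumptions made up for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;

class KeyedReduceSketch {

    // Receives only two values, like a reduceByKey function would.
    // The key is available because it rides along inside each value.
    // Hypothetical rule: "max*" keys keep the maximum, others are summed.
    static final BinaryOperator<Map.Entry<String, Integer>> REDUCER = (a, b) -> {
        String key = a.getKey();
        int merged = key.startsWith("max")
                ? Math.max(a.getValue(), b.getValue())
                : a.getValue() + b.getValue();
        return new SimpleEntry<>(key, merged);
    };

    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Map.Entry<String, Integer>> acc = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> p : pairs) {
            // Step 1: (key, value) -> (key, (key, value)), as toKeyedTuples does.
            Map.Entry<String, Integer> keyed =
                    new SimpleEntry<>(p.getKey(), p.getValue());
            // Step 2: per-key reduction with a function that takes values only.
            acc.merge(p.getKey(), keyed, REDUCER);
        }
        // Step 3: drop the embedded key again to get plain (key, value) results.
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Map.Entry<String, Map.Entry<String, Integer>> e : acc.entrySet()) {
            out.put(e.getKey(), e.getValue().getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = Arrays.asList(
                new SimpleEntry<>("max_temp", 10),
                new SimpleEntry<>("count", 1),
                new SimpleEntry<>("max_temp", 30),
                new SimpleEntry<>("count", 2));
        System.out.println(reduce(pairs)); // prints {max_temp=30, count=3}
    }
}
```

In Spark the same shape would be toKeyedTuples(rdd) followed by reduceByKey with a function over Tuple2<K, V> values, then a final map to strip the embedded key.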