Hi

I have a JavaPairRDD<K,V> rdd1. I want to group rdd1 by key while
preserving the partitioning of the original RDD, so that no shuffle occurs.
I know that all records with the same key are already in the same partition.

The pair RDD is constructed using the Kafka streaming low-level consumer,
so all records with the same key are already in the same partition. Can I
group them together while avoiding a shuffle?
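
To make the question concrete, here is a rough sketch of the per-partition
grouping I have in mind. It is plain Java with no Spark dependency, and the
class and method names (PartitionLocalGroup, groupWithinPartition) are made
up for illustration. My assumption is that the same logic could be passed to
mapPartitionsToPair with preservesPartitioning set to true, using Tuple2
instead of Map.Entry, and that each partition's groups fit in memory.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartitionLocalGroup {

    // Groups the records of ONE partition by key, entirely in memory.
    // No data leaves the partition, so nothing here would need a shuffle.
    // Assumption: every record for a given key is in this same partition.
    static <K, V> Iterator<Map.Entry<K, List<V>>> groupWithinPartition(
            Iterator<Map.Entry<K, V>> records) {
        // LinkedHashMap keeps keys in first-seen order.
        Map<K, List<V>> groups = new LinkedHashMap<>();
        while (records.hasNext()) {
            Map.Entry<K, V> rec = records.next();
            groups.computeIfAbsent(rec.getKey(), k -> new ArrayList<>())
                  .add(rec.getValue());
        }
        return groups.entrySet().iterator();
    }

    public static void main(String[] args) {
        // Stand-in for the iterator Spark would hand over for one partition.
        List<Map.Entry<String, Integer>> partition = new ArrayList<>();
        partition.add(new SimpleEntry<>("a", 1));
        partition.add(new SimpleEntry<>("b", 2));
        partition.add(new SimpleEntry<>("a", 3));

        groupWithinPartition(partition.iterator()).forEachRemaining(
                e -> System.out.println(e.getKey() + " -> " + e.getValue()));
    }
}
```

Is something like this a reasonable way to do it, or does it have a
drawback compared with groupByKey that I am missing?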

Thanks
