Re: groupByKey() and keys with many values

2015-09-07 Thread kaklakariada
Hi Antonio! Thank you very much for your answer! You are right that in my case the computation could be replaced by a reduceByKey. The thing is that my computation also involves database queries: 1. Fetch key-specific data from the database into memory. This is expensive and I only want to do this

groupByKey() and keys with many values

2015-09-07 Thread kaklakariada
Hi, I already posted this question on the users mailing list (http://apache-spark-user-list.1001560.n3.nabble.com/Using-groupByKey-with-many-values-per-key-td24538.html) but did not get a reply. Maybe this is the correct forum to ask. My problem is that doing groupByKey().mapToPair() loads all v