Hi Antonio!
Thank you very much for your answer!
You are right that in my case the computation could be replaced by a
reduceByKey. The thing is that my computation also involves database
queries:
1. Fetch key-specific data from the database into memory. This is expensive,
so I only want to do it once per key.
Hi,
I already posted this question on the users mailing list
(http://apache-spark-user-list.1001560.n3.nabble.com/Using-groupByKey-with-many-values-per-key-td24538.html)
but did not get a reply. Maybe this is the correct forum to ask.
My problem is that doing groupByKey().mapToPair() loads all values
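The concern above (an expensive per-key lookup that should run only once per key, without first materializing all of a key's values the way groupByKey does) can be sketched outside of Spark. The following is a minimal plain-Java illustration of the caching idea, not Spark code: values are processed one at a time, and a hypothetical fetchKeyData function (standing in for the database query) is invoked at most once per key via Map.computeIfAbsent.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch only: process (key, value) pairs incrementally, fetching the
// expensive key-specific data a single time per key instead of grouping
// all values of a key into memory first. "fetchKeyData" is a
// hypothetical stand-in for the database query from the thread.
public class PerKeyCache {
    private final Map<String, String> cache = new HashMap<>();
    private final Function<String, String> fetchKeyData;
    int fetches = 0; // counts how often the "database" was actually hit

    public PerKeyCache(Function<String, String> fetchKeyData) {
        this.fetchKeyData = fetchKeyData;
    }

    // Combine one value with its key's reference data; the reference
    // data is loaded on first use for that key and reused afterwards.
    public String process(String key, String value) {
        String keyData = cache.computeIfAbsent(key, k -> {
            fetches++;
            return fetchKeyData.apply(k);
        });
        return keyData + ":" + value;
    }

    public static void main(String[] args) {
        PerKeyCache p = new PerKeyCache(k -> "data(" + k + ")");
        for (String v : new String[]{"a", "b", "c"}) {
            System.out.println(p.process("k1", v));
        }
        System.out.println("fetches=" + p.fetches); // prints fetches=1
    }
}
```

In Spark terms, the analogous pattern would be to keep such a cache inside a per-partition operation (e.g. mapPartitionsToPair) so each executor fetches a key's data once rather than once per value; whether that fits depends on how keys are distributed across partitions.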