Hi Patrick,

In this particular case, at the end of my tasks I have X different types of
keys. I need to write the values for each key to its own file, so X files in
total. For now I'm writing everything to the driver node's local FS.

While the number of key-value pairs can grow to millions (billions?), X is
more or less fixed at 25-30. A groupByKey followed by a
map(xs: Iterable[Value] => xs.foreach(v => destination.write(v))) would be
great. But then again, I'm not too sure about serialization issues, and more
likely than not this idea would fail. I'll try it out anyway.

So the toLocalIterator implementation works OK for me here, though it might
turn out to be slow.
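
Roughly, the driver-side approach looks like the sketch below. It is only an
illustration, not the actual code: writeByKey, the String key/value types, and
the one-file-per-key naming are all hypothetical, and keys are assumed to be
safe to use as file names. Only one writer per distinct key stays open, so
memory use depends on X rather than on the number of pairs.

```scala
import java.io.{BufferedWriter, File, FileWriter}
import scala.collection.mutable

// Hypothetical helper: streams (key, value) pairs into one file per key.
// Assumes keys are valid file names; returns the number of values per key.
def writeByKey(pairs: Iterator[(String, String)], outDir: File): Map[String, Long] = {
  outDir.mkdirs()
  // One open writer per distinct key (25-30 in this case).
  val writers = mutable.Map.empty[String, BufferedWriter]
  val counts = mutable.Map.empty[String, Long].withDefaultValue(0L)
  try {
    pairs.foreach { case (key, value) =>
      val w = writers.getOrElseUpdate(
        key, new BufferedWriter(new FileWriter(new File(outDir, s"$key.txt"))))
      w.write(value)
      w.newLine()
      counts(key) += 1
    }
  } finally {
    // Close every writer even if a write fails part-way through.
    writers.values.foreach(_.close())
  }
  counts.toMap
}

// With Spark, the iterator would come from the RDD (assumed here to be an
// RDD[(String, String)]), pulled to the driver one partition at a time:
//   writeByKey(rdd.toLocalIterator, new File("/tmp/out"))
```

Since the helper only needs an Iterator, it can be tried out on a plain
in-memory collection before pointing it at a real RDD.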

Cheers,
Nilesh

PS: Can't wait for 1.0! ^_^ Looks like it's been RC10 till now.



--
View this message in context: 
http://apache-spark-developers-list.1001551.n3.nabble.com/all-values-for-a-key-must-fit-in-memory-tp6342p6796.html
Sent from the Apache Spark Developers List mailing list archive at Nabble.com.
