In word count, you don't need much driver memory unless you call collect(),
which is not recommended.
val file = sc.textFile("hdfs://sandbox.hortonworks.com:8020/tmp/data")
val counts =
  file.flatMap(line => line.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://sa
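
To make the collect() point concrete, here is a minimal sketch of the same job as a standalone application. The object name WordCountSketch and the two command-line paths are assumptions for illustration and do not come from the snippet above. It writes the result with saveAsTextFile and then looks at a small sample with take(10) instead of calling collect().

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical standalone version of the word count above.
object WordCountSketch {
  def main(args: Array[String]): Unit = {
    // Assumed arguments: input path and output path (placeholders, not the
    // HDFS paths from the thread).
    require(args.length >= 2, "usage: WordCountSketch <input> <output>")
    val inputPath = args(0)
    val outputPath = args(1)

    // The master is expected to be supplied by spark-submit.
    val sc = new SparkContext(new SparkConf().setAppName("WordCountSketch"))

    val counts = sc.textFile(inputPath)
      .flatMap(line => line.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // Each partition is written out by the task that computed it.
    counts.saveAsTextFile(outputPath)

    // take(10) returns only ten pairs to the driver.
    // counts.collect() would instead return every pair as one local array.
    counts.take(10).foreach(println)

    sc.stop()
  }
}
```

Note that take(10) triggers a second computation of counts here, since the RDD is not cached. For a quick look at the data that is usually acceptable.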
Hello,
it's me again.
Now I've got an explanation for the behaviour. It seems that the driver
memory is not large enough to hold the whole result set of saveAsTextFile
in memory, and then an OOM occurs. I tested it with a filter step that removes
KV pairs with a word count smaller than 100,000. So now the job