from:"motte1988"

basic streaming question

2014-09-19 Thread motte1988

Hello everybody, I'm new to Spark Streaming and played around a bit with WordCount and a PageRank algorithm in a cluster environment. Am I right that in the cluster each executor computes the data stream separately? And that the result of each executor is independent of the other executors? In the

Re: Running Wordcount on large file stucks and throws OOM exception

2014-08-26 Thread motte1988

Hello, it's me again. I now have an explanation for the behaviour. It seems that the driver memory is not large enough to hold the whole result set of saveAsTextFile in memory, and then the OOM occurs. I tested it with a filter step that removes KV pairs with a word count smaller than 100,000. So now the job
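
To make the filter step mentioned in this post concrete, here is a minimal sketch of the idea in plain Python. It is not the poster's Spark job and does not use the Spark API. The function names `count_words` and `keep_frequent` and the tiny sample input are hypothetical, chosen only for illustration. The point it shows is that dropping low-count pairs before writing shrinks the result set.

```python
from collections import Counter


def count_words(lines):
    """Count whitespace-separated words across all lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts


def keep_frequent(counts, min_count):
    """Drop (word, count) pairs whose count is smaller than min_count."""
    return {word: n for word, n in counts.items() if n >= min_count}


# Hypothetical sample input; the post used a threshold of 100,000 on a large file.
lines = ["a b a", "b a c"]
counts = count_words(lines)
frequent = keep_frequent(counts, min_count=2)
```

With a threshold of 2, only the words that occur at least twice in the sample input remain, so the output written afterwards is smaller than the full set of counts.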