Is it because countByValue or toArray puts too much stress on the driver
when there are many unique words?
To me this is a typical word-count problem, so you can solve it as follows
(correct me if I am wrong):
val textFile = sc.textFile("file")
val counts = textFile.flatMap(line => line.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)
The implementation of SVDPlusPlus shows that it produces two new graphs in
each iteration, and both are cached in memory. However, as the iterations
go on, more and more graphs stay cached, and an out-of-memory error
occurs. So I think it may need to unpersist old graphs that will not be
used any more.
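The usual fix in an iterative loop like this is to cache the newly derived graph first, and only then unpersist the previous iteration's graph. The sketch below illustrates just that reference-management pattern; `CachedGraph` and `step` are minimal stand-ins for illustration, not Spark's GraphX API.

```scala
// Minimal stand-in for a cached graph/RDD: tracks whether it is "cached".
class CachedGraph(val iteration: Int) {
  var cached = false
  def cache(): this.type = { cached = true; this }
  def unpersist(): Unit = { cached = false }
}

// One hypothetical iteration step: derives a new graph from the old one.
def step(g: CachedGraph): CachedGraph = new CachedGraph(g.iteration + 1).cache()

var graph = new CachedGraph(0).cache()
val history = scala.collection.mutable.ArrayBuffer(graph)

for (_ <- 1 to 5) {
  val newGraph = step(graph) // materialize and cache the new graph first
  graph.unpersist()          // then release the previous iteration's graph
  graph = newGraph
  history += graph
}

// Only the final graph remains cached; the older ones were released
// one iteration after they were created, so memory use stays flat.
val stillCached = history.count(_.cached)
println(stillCached) // 1
```

With real GraphX graphs the same shape applies, except that you would typically force materialization (e.g. by calling an action) before unpersisting the old graph, so recomputation does not chain back through released data.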
The job ended up running overnight with no progress. :-(
On Sat, Aug 16, 2014 at 12:16 AM, Jerry Ye wrote:
> Hi Xiangrui,
> I actually tried branch-1.1 and master and it resulted in the job being
> stuck at the TaskSetManager:
> 14/08/16 06:55:48 INFO scheduler.TaskSchedulerImpl: Adding task se
Thanks for the clarification. I don't have deep knowledge of Scala, but I
thought this was going to be reasonable to support, since the Java
serialization framework provides relatively easy ways to support these
kinds of backwards compatibility. I can see how this could be harder with
closures.
Suppo
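For plain (non-closure) classes, Java serialization's compatibility hooks do make this reasonably easy: pinning serialVersionUID lets a class evolve (for example, gain a field) while still reading bytes written by the old version. A minimal sketch of the mechanism; the `Settings` class and `roundTrip` helper are illustrative, not from Spark.

```scala
import java.io._

// Pinning serialVersionUID tells Java serialization to treat future
// versions of this class as compatible, provided changes follow the
// evolution rules (e.g. newly added fields deserialize to defaults).
@SerialVersionUID(1L)
class Settings(val name: String) extends Serializable

// Round-trip an object through Java serialization.
def roundTrip[T <: Serializable](obj: T): T = {
  val bytes = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(bytes)
  out.writeObject(obj)
  out.close()
  val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
  in.readObject().asInstanceOf[T]
}

val restored = roundTrip(new Settings("word-count"))
println(restored.name) // word-count
```

Closures are harder precisely because their generated class names and captured fields are compiler artifacts, so there is no stable class shape to pin a serialVersionUID to.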
Thanks, Manu.
For me, the sample app works only in 'local' mode.
If I try to connect to a Spark cluster (even one running locally:
spark://localhost:7077), I get the following error:
spark.master=spark://localhost:7077
[error] o.a.s.s.c.SparkDeploySchedulerBackend - Application has been
killed. Re
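One thing worth checking (an assumption here, since the log is cut off before the reason): the standalone master is strict about its URL. It must match exactly what the master itself advertises, which is usually the machine's hostname rather than localhost. Copy the spark://host:port string shown at the top of the master web UI (by default on port 8080) and use it verbatim, e.g. in conf/spark-defaults.conf:

```
# Use the exact spark://host:port string shown at the top
# of the master web UI, not a hand-typed localhost URL.
spark.master  spark://<master-hostname>:7077
```

"Application has been killed" can also mean the app registered but no worker could satisfy its resource request, so it is worth confirming in the web UI that workers are alive and have free cores and memory.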