The dreaded broadcast error Error: Failed to get broadcast_0_piece0 of broadcast_0

2015-03-25 Thread rkgurram
val transArray: RDD[Flow] <--- Flow is my custom class; it has the following methods: getFlowStart() <--- returns a Double (start time); getFlowEnd() <--- returns a Double (end time)

Re: Naive Bayes model fails after a few predictions

2015-02-10 Thread rkgurram
Further, I have tried HttpBroadcast, but that does not work either. It almost looks like a memory leak, because if I increase the number of input files to "500" instead of "200" the system crashes early. The code is as follows: logger.info("Training the model Fold:["+ fold +"]"

Naive Bayes model fails after a few predictions

2015-02-10 Thread rkgurram
Hi, I have built a "Sentiment Analyzer" using the Naive Bayes model. The model works fine: it learns from a list of 200 movie reviews and predicts correctly with an accuracy of close to 77% to 80%. After a while of predicting I get the following stack trace... By the way...I have only one

Spark does not loop through a RDD.map

2015-01-12 Thread rkgurram
Hi, I am observing some weird behavior with Spark. It might be my misinterpretation of some fundamental concepts, but I have looked at it for 3 days and have not been able to solve it. The source code is pretty long and complex, so instead of posting it I will try to articulate the problem. I am

Re: How to merge a RDD of RDDs into one uber RDD

2015-01-07 Thread rkgurram
Thank you for the response, I will certainly try that out. For now I have changed the first map in my code from "files.map" to "files.flatMap", which I guess does something similar to what you are suggesting. It gives me a List[] of elements (in this case LabeledPoints, I could also do RDDs) which I then turned into a