val transArray: RDD[Flow]   <--- Flow is my custom class; it has the following methods:
getFlowStart()   <--- returns a Double (start time)
getFlowEnd()     <--- returns a Double (end time)
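For readers who want something concrete, here is a minimal sketch of what a class with those two accessors might look like. It is only an illustration: the constructor fields `start` and `end` and the `FlowDemo` object are hypothetical names, and plain Scala collections stand in for the RDD so the snippet runs without a Spark cluster.

```scala
// Hypothetical sketch: the field names `start` and `end` are assumptions,
// not the original class definition.
// Case classes are Serializable, which matters when instances are shipped
// between Spark executors as RDD elements.
final case class Flow(start: Double, end: Double) {
  def getFlowStart(): Double = start // start time
  def getFlowEnd(): Double = end     // end time
}

object FlowDemo {
  def main(args: Array[String]): Unit = {
    // A plain Seq stands in for RDD[Flow]; map is called the same way on both.
    val flows = Seq(Flow(0.0, 1.5), Flow(2.0, 4.0))
    // Duration of each flow, computed from the two accessors.
    val durations = flows.map(f => f.getFlowEnd() - f.getFlowStart())
    println(durations)
  }
}
```

With a real RDD the same `map` call would apply, assuming the class is on the executors' classpath.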
Furthermore, I have tried HttpBroadcast, but that does not work either.
It is almost as if there is a memory leak, because if I increase the number
of input files to 500 instead of 200, the system crashes earlier.
The code is as follows:
logger.info("Training the model Fold:["+ fold +"]"
Hi,
I have built a "Sentiment Analyzer" using the Naive Bayes model. The
model works fine: it learns from a list of 200 movie reviews and predicts
with an accuracy of roughly 77% to 80%.
After a while of predicting I get the following stacktrace...
By the way...I have only one
Hi,
I am observing some weird behavior with Spark. It might be my
misinterpretation of some fundamental concepts, but I have looked at it for 3
days and have not been able to solve it.
The source code is pretty long and complex so instead of posting it, I will
try to articulate the problem.
I am
Thank you for the response, sure will try that out.
I have now changed the first map in my code from "files.map" to
"files.flatMap", which I guess does something similar to what you are
suggesting. It gives me a List[] of elements (in this case LabeledPoints; I
could also do RDDs), which I then turned into a