I don't really know how to create a JIRA :(
Specifically, the lines I commented out are:
//val prediction = model.predict(test.map(_.features))
//val predictionAndLabel = prediction.zip(test.map(_.label))
//val prediction = model.predict(training.map(_.features))
//val predictionAndLabel = prediction.zip(training.map(_.label))
I don't really have "my code"; I was just running the example program in:
examples/src/main/scala/org/apache/spark/examples/mllib/BinaryClassification.scala
What I did was simply try this example on 13M-dimensional sparse data, and I
got the error I posted.
Today I managed to run it after I commented out the prediction-related code.
I made a bit of progress. I think the problem is with
"BinaryClassificationMetrics":
as long as I comment out all the prediction-related metrics, I can run the
SVM example with my data.
So the problem should be there, I guess.
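For reference, the prediction/metrics part of the example looks roughly like this (a sketch of the 1.0.x example code; exact names may differ slightly):

import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

// Score the test set and pair each prediction with its true label.
val prediction = model.predict(test.map(_.features))
val predictionAndLabel = prediction.zip(test.map(_.label))

// Computing these metrics sorts and aggregates all the predictions,
// which is where the failure seems to be triggered.
val metrics = new BinaryClassificationMetrics(predictionAndLabel)
println(s"Test areaUnderPR = ${metrics.areaUnderPR()}.")
println(s"Test areaUnderROC = ${metrics.areaUnderROC()}.")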
--
(1) What is "number of partitions"? Is it the number of workers per node?
(See the sketch below.)
(2) I already set the driver memory pretty high, to 25g.
(3) I am running Spark 1.0.1 in a standalone cluster with 9 nodes; one of
them works as the master, and the others are workers.
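On (1), "number of partitions" means how many pieces an RDD is split into across the whole cluster, not workers per node. A minimal sketch of controlling it when loading the data, assuming MLUtils.loadLibSVMFile as in the example (the path and count are placeholders):

import org.apache.spark.mllib.util.MLUtils

// Repartition so each task works on a smaller slice of the 13M-feature data.
val numPartitions = 64 // illustrative value
val examples = MLUtils.loadLibSVMFile(sc, "hdfs://.../data.txt")
  .repartition(numPartitions)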
--
Hi Krishna,
Thanks for your help. Were you able to get your 29M data running yet? I fixed
the previous problem by setting a larger spark.akka.frameSize, but now I get
some other errors, shown below. Did you get these errors before?
14/07/14 11:32:20 ERROR TaskSchedulerImpl: Lost executor 1 on node7: remote
Akka client disassociated
Hi Xiangrui,
Where can I set "spark.akka.frameSize"?
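One place to set it is on the SparkConf before creating the SparkContext (a minimal sketch; the value is in MB, the default is 10, and 128 is just an example):

import org.apache.spark.{SparkConf, SparkContext}

// Raise the Akka frame size so large task results and updates
// fit within a single message.
val conf = new SparkConf()
  .setAppName("BinaryClassification")
  .set("spark.akka.frameSize", "128")
val sc = new SparkContext(conf)

It can also go in conf/spark-defaults.conf so it applies to every spark-submit run.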
--
Hi,
I encountered an error when testing SVM (the example one) on very large
sparse data. The dataset I ran on was a toy dataset with only ten examples,
but the sparse vectors are 13 million dimensional with a few thousand
non-zero entries.
The error is shown below. I am wondering: is this a bug, or am I missing
something?
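For concreteness, a single example of the shape described above could be constructed like this (a hypothetical sketch; the indices and values are made up):

import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// One labeled point: a 13M-dimensional sparse vector with only a few
// thousand non-zeros (three are shown here for brevity).
val dim = 13000000
val indices = Array(3, 500000, 12999999)
val values = Array(1.0, 0.5, 2.0)
val point = LabeledPoint(1.0, Vectors.sparse(dim, indices, values))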
Hi Xiangrui,
Thanks for the information. Also, is it possible to figure out the execution
time per iteration for SVM?
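In case it helps: one crude way to approximate per-iteration time (an assumption on my part, not a built-in API) is to time two training runs with different iteration counts and divide the difference, assuming `training` is the RDD[LabeledPoint] being fit:

import org.apache.spark.mllib.classification.SVMWithSGD

// Wall-clock timing helper, returning elapsed seconds.
def time[A](body: => A): Double = {
  val start = System.nanoTime()
  body
  (System.nanoTime() - start) / 1e9
}

val t10 = time(SVMWithSGD.train(training, 10))
val t20 = time(SVMWithSGD.train(training, 20))
println(s"~${(t20 - t10) / 10.0} seconds per iteration")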
--
Hi,
I am trying to run the example BinaryClassification
(org.apache.spark.examples.mllib.BinaryClassification) on a 202G file. I am
constantly getting messages like the one below; is this normal, or am I
missing something?
14/07/12 09:49:04 WARN BlockManager: Block rdd_4_196 could not be dropped
from memory as it does not exist
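If the 202G input cannot fit in cluster memory, one thing that may quiet these warnings (a sketch, assuming the RDD is being cached explicitly) is to persist with a storage level that spills to disk instead of the memory-only default:

import org.apache.spark.storage.StorageLevel
import org.apache.spark.mllib.util.MLUtils

// MEMORY_AND_DISK writes evicted blocks to disk rather than dropping and
// recomputing them when memory runs out.
val examples = MLUtils.loadLibSVMFile(sc, "hdfs://.../data.txt") // placeholder path
  .persist(StorageLevel.MEMORY_AND_DISK)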