I don't really know how to create a JIRA :(
Specifically, the code I commented out is:
//val prediction = model.predict(test.map(_.features))
//val predictionAndLabel = prediction.zip(test.map(_.label))
//val prediction = model.predict(training.map(_.features))
//val predictionAndLabel = prediction.zip(training.map(_.label))
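(For context, in the example those pairs feed the metrics computation; a minimal sketch of that pattern, with the import path per the 1.0 MLlib API and the println wording illustrative:)

import org.apache.spark.mllib.evaluation.BinaryClassificationMetrics

val metrics = new BinaryClassificationMetrics(predictionAndLabel)
println(s"Test areaUnderPR = ${metrics.areaUnderPR()}")
println(s"Test areaUnderROC = ${metrics.areaUnderROC()}")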
Then it may be a new issue. Do you mind creating a JIRA to track this
issue? It would be great if you could help locate the line in
BinaryClassificationMetrics that caused the problem. Thanks! -Xiangrui
On Tue, Jul 15, 2014 at 10:56 PM, crater wrote:
> I don't really have "my code", I was just runn
I don't really have "my code"; I was just running the example program in:
examples/src/main/scala/org/apache/spark/examples/mllib/BinaryClassification.scala
What I did was simply try this example on 13M sparse data, and I got the
error I posted.
Today I managed to run it after I commented out the prediction-related code.
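(For reference, the example is launched through spark-submit; the jar name, master URL, and data path below are illustrative, and the exact flags come from the example's usage string:)

bin/spark-submit \
  --class org.apache.spark.examples.mllib.BinaryClassification \
  --master spark://master:7077 \
  examples/target/scala-2.10/spark-examples-1.0.1-hadoop2.2.0.jar \
  --algorithm SVM hdfs:///path/to/13m-sparse.libsvm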
crater, was the error message the same as what you posted before:
14/07/14 11:32:20 ERROR TaskSchedulerImpl: Lost executor 1 on node7: remote
Akka client disassociated
14/07/14 11:32:20 WARN TaskSetManager: Lost TID 20 (task 13.0:0)
14/07/14 11:32:21 ERROR TaskSchedulerImpl: Lost executor 3 on nod
I made a bit of progress. I think the problem is with
"BinaryClassificationMetrics":
as long as I comment out all the prediction-related metrics, I can run the
SVM example with my data.
So the problem should be there, I guess.
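(One way to narrow down the failing line, sketched assuming the metrics are built from predictionAndLabel as in the example:)

predictionAndLabel.cache()
println(predictionAndLabel.count())  // if this succeeds, predict() itself is fine
val metrics = new BinaryClassificationMetrics(predictionAndLabel)
println(metrics.areaUnderROC())      // then exercise each metric, one at a time
println(metrics.areaUnderPR())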
(1) What is "number of partitions"? Is it the number of workers per node?
(2) I already set the driver memory pretty big, 25g.
(3) I am running Spark 1.0.1 on a standalone cluster with 9 nodes; one of them
works as the master, the others are workers.
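(A partition is a split of an RDD, i.e., one task's worth of data, not a worker per node; a quick illustrative check, with a placeholder path:)

import org.apache.spark.mllib.util.MLUtils

val data = MLUtils.loadLibSVMFile(sc, "hdfs:///path/to/data.libsvm")
println(data.partitions.size)     // current partition count
val tuned = data.repartition(64)  // e.g., roughly the total core count across workers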
I am running Spark 1.0.1 on a 5-node YARN cluster. I have set the
driver memory to 8G and the executor memory to about 12G.
Regards,
Krishna
On Mon, Jul 14, 2014 at 5:56 PM, Xiangrui Meng wrote:
> Is it on a standalone server? There are several settings worth checking:
>
> 1) number of partition
Is it on a standalone server? There are several settings worth checking:
1) the number of partitions, which should match the number of cores
2) the driver memory (you can see it on the executor tab of the Spark
WebUI and set it with "--driver-memory 10g")
3) the version of Spark you are running
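(For 2) and 3), a quick illustrative check from the driver; getOption avoids an exception when the key is unset:)

println(sc.version)                                   // 3) the Spark version
println(sc.getConf.getOption("spark.driver.memory"))  // 2) driver memory, if set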
Best
That is exactly the same error that I got. I am still having no success.
Regards,
Krishna
On Mon, Jul 14, 2014 at 11:50 AM, crater wrote:
> Hi Krishna,
>
> Thanks for your help. Are you able to get your 29M data running yet? I fixed
> the previous problem by setting a larger spark.akka.frameSize, bu
Hi Krishna,
Thanks for your help. Are you able to get your 29M data running yet? I fixed
the previous problem by setting a larger spark.akka.frameSize, but now I get
some other errors, shown below. Did you get these errors before?
14/07/14 11:32:20 ERROR TaskSchedulerImpl: Lost executor 1 on node7: remote
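(A disassociated remote Akka client usually means the executor JVM died, often from running out of memory; an illustrative mitigation, with placeholder sizes:)

bin/spark-submit --executor-memory 12g --driver-memory 10g ...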
If you use Scala, you can do:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("yarn-client")
  .setAppName("Logistic regression SGD fixed")
  .set("spark.akka.frameSize", "100")
  .setExecutorEnv("SPARK_JAVA_OPTS", " -Dspark.akka.frameSize=100")
val sc = new SparkContext(conf)
Hi Xiangrui,
Where can I set "spark.akka.frameSize"?
You need to set a larger `spark.akka.frameSize`, e.g., 128, for the
serialized weight vector. There is a JIRA about switching
automatically between sending through Akka and broadcast:
https://issues.apache.org/jira/browse/SPARK-2361 . -Xiangrui
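(Besides setting it in code, a sketch assuming the standard properties file read by spark-submit:)

# conf/spark-defaults.conf
spark.akka.frameSize    128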
On Mon, Jul 14, 2014 at 12:15 AM, crater wrote:
> Hi,