Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-23 Thread jaeholee
So if I am using GraphX on Spark and I create a graph that gets used many times later, should I cache the graph itself? Or should I cache the vertices and edges (the actual data) that I use to create the graph? e.g. val graph = Graph(vertices, edges) graph.blahblahblah graph.blahblahblah graph.blahblahb

Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-23 Thread jaeholee
After doing that, I ran my code once with a smaller example, and it worked. But ever since then, I get the "No space left on device" message for the same sample, even if I re-start the master... ERROR TaskSetManager: Task 29.0:20 failed 4 times; aborting job org.apache.spark.SparkException: Job ab

Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-22 Thread jaeholee
OK. I tried setting the number of partitions to 128 and to values greater than 128, and now I get a different error message about "Java heap space". Is it possible that there is something wrong with the setup of my Spark cluster to begin with? Or is it still an issue with how my data is partitioned? Or do I just n

Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-22 Thread jaeholee
How do you determine the number of partitions? For example, I have 16 workers, and the number of cores and the worker memory set in spark-env.sh are: CORE = 8 MEMORY = 16g The .csv data I have is about 500MB, but I am eventually going to use a file that is about 15GB. Is the MEMORY variable in s

Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-22 Thread jaeholee
Spark is running fine, but I get this message. Does this mean that my data is just too big? 14/04/22 17:06:20 ERROR TaskSchedulerImpl: Lost executor 2 on WORKER#2: OutOfMemoryError 14/04/22 17:06:20 ERROR TaskSetManager: Task 550.0:2 failed 4 times; aborting job org.apache.spark.SparkException

Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-22 Thread jaeholee
Wow, it worked! Thank you so much! So now all I need to do is pass the number of workers that I want to use when I read the data, right? e.g. val numWorkers = 10 val data = sc.textFile("somedirectory/data.csv", numWorkers)

Re: ERROR TaskSchedulerImpl: Lost an executor

2014-04-22 Thread jaeholee
No, I am not using AWS. I am using a cluster at one of the national labs. But as I mentioned, I am pretty new to computer science, so I might not be answering your question correctly... but port 7077 is accessible. Maybe I got it wrong from the get-go? I will just write down what I did... Basically I have

ERROR TaskSchedulerImpl: Lost an executor

2014-04-21 Thread jaeholee
Hi, I am trying to set up my own standalone Spark, and I started the master node and worker nodes. Then I ran ./bin/spark-shell, and I get this message: 14/04/21 16:31:51 ERROR TaskSchedulerImpl: Lost an executor 1 (already removed): remote Akka client disassociated 14/04/21 16:31:51 ERROR TaskSch