Is data being cached? It might be that those two nodes started first and did the first pass over the data, so it's all on them. It's kind of ugly, but you can add a Thread.sleep when your program starts to wait for the other nodes to come up.
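A minimal sketch of that workaround, assuming a standalone Scala driver (the object name, input path, and sleep duration below are hypothetical placeholders, not part of the original message):

    import org.apache.spark.{SparkConf, SparkContext}

    // Hypothetical sketch: pause after startup so all executors register
    // before the first action; otherwise cached partitions can pile up on
    // whichever executors happened to come up first.
    object WarmupExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("warmup-example"))

        // Crude but effective: give the rest of the cluster time to come up.
        Thread.sleep(30000) // 30 seconds; tune to your cluster's startup time

        val reads = sc.textFile("hdfs:///data/reads") // hypothetical input path
        reads.cache()
        reads.count() // the first action materializes the cache across executors
      }
    }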
Also, have you checked the application web UI at http://<driver node>:4040 while the app is running? It shows where each task ran and where each partition of data is, which might reveal that, e.g., some tasks take much longer than others due to data skew, or stuff like that.

Matei

On July 29, 2014 at 10:13:14 AM, rpandya (r...@iecommerce.com) wrote:

OK, I did figure this out. I was running the app (avocado) using spark-submit, when it was actually designed to take command-line arguments to connect to a Spark cluster. Since I didn't provide any such arguments, it started a nested local Spark cluster *inside* the YARN Spark executor, and so of course everything ran on one node. If I spin up a Spark cluster manually and provide the Spark master URI to avocado, it works fine.

Now I've tried running a reasonable-sized job (400GB of data on 10 HDFS/Spark nodes), and the partitioning is strange. Eight nodes get almost nothing, and the other two nodes each get half the work. This happens whether I use coalesce with shuffle=true or false before the work stage. (Though if I use shuffle=true, it creates 3000 tasks to do the shuffle, and still ends up with this skewed distribution!) Any suggestions on how to figure out what's going on?

Thanks,

Ravi

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Memory-compute-intensive-tasks-tp9643p10868.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
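For reference, a minimal sketch of the setup Ravi describes, assuming a Scala driver connecting to a standalone cluster (the master URI, app name, input path, and partition count are hypothetical placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    // Point the app at the cluster's master explicitly, rather than letting
    // it fall back to a nested local master inside the YARN executor.
    val conf = new SparkConf()
      .setAppName("avocado-job")              // hypothetical app name
      .setMaster("spark://master-host:7077")  // hypothetical master URI
    val sc = new SparkContext(conf)

    val input = sc.textFile("hdfs:///data/reads") // hypothetical input path

    // coalesce with shuffle = true triggers a full shuffle, which spreads the
    // data across all partitions; shuffle = false only merges existing
    // partitions and can preserve a skewed placement.
    val balanced = input.coalesce(100, shuffle = true) // partition count is a guess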