Re: All executors run on just a few nodes

2014-10-19 Thread raymond
When the data’s source host is not one of the registered executors, it will also be marked as PROCESS_LOCAL, though it should have a different name for this. I don’t know whether someone changed this name very recently, but for 0.9 it is the case. When I say satisfy, yes, if the executors hav

Re: All executors run on just a few nodes

2014-10-19 Thread Tao Xiao
Raymond, thank you. But I read in another thread that "PROCESS_LOCAL" means the data is in the same JVM as the code that is running. When data is in the same JVM

Re: All executors run on just a few nodes

2014-10-19 Thread raymond
My best guess is that the speed at which your executors register with the driver differs between runs. When you ran it for the first time, the executors were not fully registered when the task set manager started to assign tasks, and thus the tasks were assigned to available executors which have already sa

All executors run on just a few nodes

2014-10-19 Thread Tao Xiao
Hi all, I have a Spark 0.9 cluster with 16 nodes. I wrote a Spark application to read data from an HBase table, which has 86 regions spread over 20 RegionServers. I submitted the Spark app in Spark standalone mode and found that there were 86 executors running on just 3 nodes and it too