What does your spark-env file say? Are you setting the number of executors in the Spark context? (A sketch of what I mean is below the quoted message.)
On 20 May 2015 13:16, "Shailesh Birari" <sbirar...@gmail.com> wrote:
> Hi,
>
> I have a 4 node Spark 1.3.1 cluster. All four nodes have 4 cores and 64 GB
> of RAM.
> I have around 600,000+ JSON files on HDFS. Each file is small, around 1 KB
> in size. Total data is around 16 GB. The Hadoop block size is 256 MB.
> My application reads these files with the sc.textFile() (or sc.jsonFile(),
> tried both) API. But all the files are getting read by only one node (4
> executors). The Spark UI shows all 600K+ tasks on one node and 0 on the
> other nodes.
>
> I confirmed that all files are accessible from all nodes. Another
> application, which uses big files, uses all nodes on the same cluster.
>
> Can you please let me know why it is behaving this way?
>
> Thanks,
> Shailesh
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Job-not-using-all-nodes-in-cluster-tp22951.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
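Not part of the original exchange, but a minimal sketch of the settings the reply is asking about, assuming a standalone or YARN deployment; the property names are standard Spark 1.x settings, while the values and the app name are illustrative only:

    # conf/spark-env.sh on each worker (standalone mode) -- illustrative values
    #   SPARK_WORKER_CORES=4
    #   SPARK_WORKER_MEMORY=48g

    import org.apache.spark.{SparkConf, SparkContext}

    // Ask for resources across the whole cluster rather than whatever one worker offers.
    val conf = new SparkConf()
      .setAppName("json-ingest")             // hypothetical app name
      .set("spark.cores.max", "16")          // standalone: cap over all 4 nodes (4 cores each)
      .set("spark.executor.memory", "8g")
      .set("spark.executor.instances", "4")  // YARN: request one executor per node

    val sc = new SparkContext(conf)

If no cluster-wide limit or executor count is requested, the scheduler may satisfy the application from a single worker, which would match the behaviour described below.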
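And, purely as an illustration of the read pattern being described (not the poster's actual code), a sketch assuming the files sit under one HDFS directory; the path and the partition counts are made up:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("read-small-json"))

    // Each tiny file becomes at least one input split, hence the 600K+ tasks;
    // the second argument to textFile is only a minimum-partition hint.
    val lines = sc.textFile("hdfs:///data/json/*.json", 64)

    // Repartition before any heavy processing so the work is spread over every
    // executor the application actually acquired, not just where the splits landed.
    val spread = lines.repartition(64)
    println(spread.count())

Repartitioning only helps once executors exist on all four nodes, so the executor settings above are the first thing to check.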