What does your spark-env file say? Are you setting the number of executors in the Spark context? (A sketch of what I mean is below the quoted message.)
On 20 May 2015 13:16, "Shailesh Birari" <sbirar...@gmail.com> wrote:
> Hi,
>
> I have a 4 node Spark 1.3.1 cluster. All four nodes have 4 cores and 64 GB
> of RAM.
> I have around 600,000+ JSON files on HDFS. Each file is small, around 1 KB
> in size. Total data is around 16 GB. The Hadoop block size is 256 MB.
> My application reads these files with the sc.textFile() (or sc.jsonFile(),
> tried both) API. But all the files are getting read by only one node (4
> executors). The Spark UI shows all 600K+ tasks on one node and 0 on the
> other nodes.
>
> I confirmed that all files are accessible from all nodes. Another
> application, which uses big files, uses all nodes on the same cluster.
>
> Can you please let me know why it is behaving this way?
>
> Thanks,
> Shailesh
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Job-not-using-all-nodes-in-cluster-tp22951.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
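Not part of the original exchange, but a minimal sketch of the settings the reply is asking about, assuming a standalone or YARN deployment; the property names are standard Spark 1.x settings, while the values and the app name are illustrative only:

    # conf/spark-env.sh on each worker (standalone mode) -- illustrative values
    #   SPARK_WORKER_CORES=4
    #   SPARK_WORKER_MEMORY=48g

    import org.apache.spark.{SparkConf, SparkContext}

    // Ask for resources across the whole cluster rather than whatever one worker offers.
    val conf = new SparkConf()
      .setAppName("json-ingest")             // hypothetical app name
      .set("spark.cores.max", "16")          // standalone: cap over all 4 nodes (4 cores each)
      .set("spark.executor.memory", "8g")
      .set("spark.executor.instances", "4")  // YARN: request one executor per node

    val sc = new SparkContext(conf)

If no cluster-wide limit or executor count is requested, the scheduler may satisfy the application from a single worker, which would match the behaviour described below.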
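And, purely as an illustration of the read pattern being described (not the poster's actual code), a sketch assuming the files sit under one HDFS directory; the path and the partition counts are made up:

    import org.apache.spark.{SparkConf, SparkContext}

    val sc = new SparkContext(new SparkConf().setAppName("read-small-json"))

    // Each tiny file becomes at least one input split, hence the 600K+ tasks;
    // the second argument to textFile is only a minimum-partition hint.
    val lines = sc.textFile("hdfs:///data/json/*.json", 64)

    // Repartition before any heavy processing so the work is spread over every
    // executor the application actually acquired, not just where the splits landed.
    val spread = lines.repartition(64)
    println(spread.count())

Repartitioning only helps once executors exist on all four nodes, so the executor settings above are the first thing to check.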