The number of executors is set when you launch the shell or an application with /spark-submit/. It's controlled by the /--num-executors/ option: https://databaseline.wordpress.com/2016/03/12/an-overview-of-apache-streaming-technologies/.
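
As a rough illustration of where that option goes, here is a minimal Python sketch that assembles a spark-submit command line. The helper name build_spark_submit, the application file my_app.py, and the values 5 and 2 are made up for the example; the sketch also assumes a YARN cluster, which is where --num-executors is normally honoured.

```python
import shlex


def build_spark_submit(app, num_executors, executor_cores=None, master="yarn"):
    """Assemble a spark-submit argument list (hypothetical helper, not part of Spark)."""
    argv = ["spark-submit", "--master", master,
            "--num-executors", str(num_executors)]
    if executor_cores is not None:
        # --executor-cores sets how many cores each executor may use.
        argv += ["--executor-cores", str(executor_cores)]
    argv.append(app)  # placeholder application file
    return argv


cmd = build_spark_submit("my_app.py", num_executors=5, executor_cores=2)
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or the list can be handed to subprocess.run if you prefer to launch the job from Python.
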
It is also important to note that cranking up the number will not necessarily make your queries run faster. If you set it to, let's say, 200 but you only have 10 cores spread over 5 nodes, you are unlikely to see a significant speed-up beyond 5-10 executors, because the number of tasks that can run at the same time is capped by the cores that are actually available. You may want to check out Cloudera's tuning guide: http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/
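
To make the arithmetic concrete, here is a deliberately simplified sketch. It assumes one core per executor and ignores memory limits, the driver, and resource-manager overhead, so treat it as a back-of-the-envelope model rather than how Spark actually schedules work. The function name usable_task_slots is made up for the example.

```python
def usable_task_slots(requested_executors, cores_per_executor, total_cores):
    """Upper bound on concurrently running tasks under a simplified model.

    Assumption: executors can never use more cores than the cluster has.
    """
    return min(requested_executors * cores_per_executor, total_cores)


# 10 cores in total, as in the example above; one core per executor (assumed).
for requested in (2, 5, 10, 200):
    print(requested, usable_task_slots(requested, 1, total_cores=10))
```

Under these assumptions, asking for 200 executors gives no more parallelism than asking for 10.
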
