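To the original question of parallelism and executors: you can have a parallelism of 200 even with 2 executors. spark.sql.shuffle.partitions controls how many tasks a shuffle stage is split into, not how many executors run them. In the Spark UI, you should see that the number of _tasks_ is 200 when your job involves shuffling.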
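
To make the tasks-versus-executors distinction concrete, here is a rough sketch in plain Python (no Spark required). It is a toy model, not how Spark's scheduler is implemented: it assumes each executor core runs one task at a time and hands tasks out round-robin, while the real scheduler assigns tasks as slots free up and takes data locality into account. The function names and the figure of 4 cores per executor are made up for illustration.

```python
import math


def count_waves(num_tasks, num_executors, cores_per_executor):
    """Sequential 'waves' needed if every core runs exactly one task at a time."""
    slots = num_executors * cores_per_executor
    return math.ceil(num_tasks / slots)


def tasks_per_executor(num_tasks, num_executors, cores_per_executor):
    """Round-robin the tasks over all task slots and count tasks per executor."""
    slots = num_executors * cores_per_executor
    counts = [0] * num_executors
    for task_id in range(num_tasks):
        slot = task_id % slots                 # which slot picks up this task
        executor = slot // cores_per_executor  # which executor owns that slot
        counts[executor] += 1
    return counts


if __name__ == "__main__":
    # 200 shuffle partitions means 200 tasks; 2 executors with an assumed 4 cores each.
    print(count_waves(200, 2, 4))
    print(tasks_per_executor(200, 2, 4))
```

In this model all 200 tasks still run; they simply queue up behind the 8 available slots. Adding executors or cores shortens the queue, but it does not change the task count that the UI reports.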
Executors vs. tasks:
http://spark.apache.org/docs/latest/cluster-overview.html

Xinh

On Mon, May 23, 2016 at 5:48 AM, Mathieu Longtin <[email protected]> wrote:

> Since the default is 200, I would guess you're only running 2 executors.
> Try to verify how many executors you are actually running with the web
> interface (port 8080 where the master is running).
>
> On Sat, May 21, 2016 at 11:42 PM Ted Yu <[email protected]> wrote:
>
>> Looks like an equal sign is missing between partitions and 200.
>>
>> On Sat, May 21, 2016 at 8:31 PM, SRK <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> How to set the degree of parallelism in Spark SQL? I am using the
>>> following
>>> but it somehow seems to allocate only two executors at a time.
>>>
>>> sqlContext.sql(" set spark.sql.shuffle.partitions 200 ")
>>>
>>> Thanks,
>>> Swetha
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/How-to-set-the-degree-of-parallelism-in-Spark-SQL-tp26996.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>>
> --
> Mathieu Longtin
> 1-514-803-8977
