I am reading from a Kafka topic that has 8 partitions. My Spark app is given
40 executors (1 core per executor). After reading the data, I repartition
the DStream into 500 partitions, map it, and save it to Cassandra.
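For reference, here is roughly what the job looks like (the broker address,
topic, keyspace/table names, batch interval, and the Event mapping below are
placeholders, not my actual code):

import kafka.serializer.StringDecoder
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils
import com.datastax.spark.connector.streaming._

case class Event(key: String, value: String)

object KafkaToCassandra {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("kafka-to-cassandra")
    val ssc = new StreamingContext(conf, Seconds(10))

    val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
    val topics = Set("my-topic") // this topic has 8 partitions

    // Direct stream: Spark creates one partition per Kafka partition (8 here)
    val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
      ssc, kafkaParams, topics)

    stream
      .repartition(500)                         // intended to spread work across the 40 executors
      .map { case (k, v) => Event(k, v) }       // placeholder transformation
      .saveToCassandra("my_keyspace", "events") // via spark-cassandra-connector

    ssc.start()
    ssc.awaitTermination()
  }
}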
However, I see that only 2 executors are being used per batch. Even though I
see 500 tasks for the stage, all of them are scheduled sequentially on the 2
executors picked. My Spark concepts are still forming, and I must be missing
something obvious.
I expected that 8 executors would be picked to read data from the 8
partitions in Kafka, and that after the repartition the data would be
distributed across all 40 executors and then saved to Cassandra.
How should I think about this?


