Hi,
I am working on a Spark Streaming application integrated with Kafka via the
createDirectStream API. The application consumes a topic that has 10
partitions (on the Kafka side) and runs with 10 executors (--num-executors 10).
When it reads data from Kafka/ZooKeeper, Spark creates 10 tasks (one per
topic partition). However, some executors are assigned more than one
task and process those tasks sequentially.
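
For context, the stream is created roughly like the sketch below (the topic
name, broker list, and batch interval are placeholders, not the real
application's values):

    import kafka.serializer.StringDecoder
    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}
    import org.apache.spark.streaming.kafka.KafkaUtils

    object DirectStreamSketch {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf().setAppName("KafkaDirectStream")
        // Placeholder batch interval
        val ssc = new StreamingContext(conf, Seconds(10))

        // Placeholder broker list and topic name
        val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
        val topics = Set("mytopic") // topic with 10 partitions

        // The direct stream creates one RDD partition per Kafka partition,
        // so each batch has 10 tasks
        val stream = KafkaUtils.createDirectStream[String, String,
          StringDecoder, StringDecoder](ssc, kafkaParams, topics)

        stream.foreachRDD { rdd =>
          println("partitions per batch: " + rdd.partitions.size)
        }

        ssc.start()
        ssc.awaitTermination()
      }
    }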
Why doesn't Spark distribute these 10 tasks across the 10 executors, one
task each? How can I make it do that?
Thanks,
Patcharee