Hi all, I am running a simple analysis using Spark Streaming. I set both the number of executors and the default parallelism to 300. The program consumes data from Kafka and does a simple groupBy operation with 300 as the parameter. The batch interval is one minute. In the first two batches, around 50 executors are used. After the first two batches, however, only 2 executors are ever used for the groupBy operation, which makes it run very slowly.
Does anyone have an idea why only 2 executors are assigned to this operation? Thanks! Bill