I'm running a job shaped like this: rdd.groupBy(...).map(...).collect()
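For concreteness, here is a minimal sketch of that shape with per-iteration timing. It is not my actual job: the input data, the key function (x % 60), the per-group work (values.size), the local master, and the two config values are placeholders I made up for illustration. The property names spark.akka.frameSize and spark.reducer.maxMbInFlight are the ones used by older Spark releases and may not exist under those names in newer versions.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object SlowCollectSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("slow-collect-sketch")
      .setMaster("local[4]")                    // placeholder master
      .set("spark.akka.frameSize", "10")        // in MB; placeholder value
      .set("spark.reducer.maxMbInFlight", "48") // in MB; placeholder value
    val sc = new SparkContext(conf)

    // Placeholder input; the real job reads its own data.
    val rdd = sc.parallelize(1 to 1000000)

    for (iteration <- 1 to 5) {
      val start = System.nanoTime()
      val result = rdd
        .groupBy(x => x % 60, 60)                         // 60 keys, 60 partitions
        .map { case (key, values) => (key, values.size) } // stand-in for the final map
        .collect()
      val elapsedMs = (System.nanoTime() - start) / 1000000
      println(s"iteration $iteration: ${result.length} groups in $elapsedMs ms")
    }

    sc.stop()
  }
}
```

Timing each iteration this way is only meant to make the slowdown across iterations visible; I have not confirmed that this toy version reproduces the problem.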
The workload in the final map is distributed pretty evenly. When the collect runs on, say, 60 partitions, the first 55 or so finish very quickly, within about 10 seconds. The last 5, and particularly the very last one, are typically much slower, pushing the overall collect time to 30 seconds and sometimes even 1 minute. For example, the job will sit in a state like 54/55 for a much longer time than the earlier tasks took.

Another interesting thing is that the first iteration typically doesn't have this problem. It gets progressively worse in subsequent iterations, even though the workload and partition sizes stay about the same.

The problem also worsens with smaller akka frameSize and/or maxMbInFlight settings.

Does anyone know why this happens?