This tends to happen with operations like sortByKey, mapPartitions, groupBy, and reduceByKey. One thing you can try is passing an explicit number of partitions (possibly more than 2x the number of CPUs) when calling these operations.
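As a rough illustration of the suggestion above, here is a minimal PySpark sketch that passes an explicit partition count to reduceByKey. The helper names `suggested_partitions` and `word_counts` and the factor of 3 are my own assumptions for illustration, and `sc.defaultParallelism` is only a stand-in for the real core count of your cluster.

```python
def suggested_partitions(total_cores, factor=3):
    """Hypothetical helper: pick a partition count above 2x the core count."""
    return max(1, total_cores * factor)


def word_counts(sc, words):
    """Count words, passing an explicit partition count to the shuffle.

    `sc` is an existing pyspark.SparkContext (assumed to be created elsewhere).
    """
    # Assumption: defaultParallelism approximates the number of available cores.
    n = suggested_partitions(sc.defaultParallelism)
    pairs = sc.parallelize(words).map(lambda w: (w, 1))
    # numPartitions controls how many reduce tasks the shuffle produces.
    counts = pairs.reduceByKey(lambda a, b: a + b, numPartitions=n)
    # Records per partition; a few very large entries point to skewed keys.
    sizes = counts.glom().map(len).collect()
    return counts.collect(), sizes
```

With a running SparkContext, `word_counts(sc, ["a", "b", "a"])` returns the counts along with the per-partition record counts. sortByKey and groupByKey accept a `numPartitions` argument in the same way. If the sizes stay very uneven even with more partitions, the cause is more likely a few heavy keys than the partition count.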

Thanks
Best Regards

On Wed, Aug 13, 2014 at 7:54 AM, Bin <wubin_phi...@126.com> wrote:

> Hi All,
>
> I met a problem that for each stage, most workers finished fast (around
> 1min), but a few workers spent like 7min to finish, which significantly
> slow down the process.
>
> As shown below, the running time is very unbalancedly distributed over
> workers.
>
> I wonder whether this is normal? Is it related to the partition strategy?
> For now, I used the default partition strategy.
>
> Looking for advice!
>
> Thanks very much!
>
> Best,
> Bin