Hi,

One word: SKEW
This looks like the classic skew problem. You will have to apply skew-handling techniques to repartition your data properly (salting the hot join key is a common one), or, if you are on Spark 3.0+, try the adaptive skew-join optimization (spark.sql.adaptive.skewJoin.enabled).

On Tue, 26 Jan 2021 at 11:20, rajat kumar <kumar.rajat20...@gmail.com> wrote:

> Hi Everyone,
>
> I am running a spark application where I have applied 2 left joins. 1st
> join in Broadcast and another one is normal.
> Out of 200 tasks , last 1 task is stuck . It is running at "ANY" Locality
> level. It seems data skewness issue.
> It is doing too much spill and shuffle write is too much. Following error
> is coming in executor logs:
>
> INFO UnsafeExternalSorter: Thread spilling sort data of 10.4 GB to disk
> (10 times so far)
>
>
> Can anyone please suggest what can be wrong?
>
> Thanks
> Rajat
>
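To make the "skew techniques" in the reply above a bit more concrete, here is a minimal sketch of key salting in plain Python. It is not Spark code and not taken from the job in this thread. The toy data, NUM_PARTITIONS, NUM_SALTS and partition_of are all made up for illustration. It only shows why one task ends up with almost all the rows, and how adding a salt to the join key spreads the hot key out without changing the left-join result.

```python
from collections import Counter

NUM_PARTITIONS = 8  # hypothetical partition count (the job in the thread had 200 tasks)
NUM_SALTS = 4       # hypothetical fan-out used to split the hot key

# Toy left side: key 7 is the hot key, the remaining keys are rare.
left = [(7, f"row{i}") for i in range(1000)] + [(1, "a"), (2, "b"), (3, "c")]
# Toy right side: key 3 has no match, so a left join yields None for it.
right = {7: "hot", 1: "one", 2: "two"}


def partition_of(key, salt=0):
    """Illustrative stand-in for a hash partitioner over (key, salt)."""
    return (key * NUM_SALTS + salt) % NUM_PARTITIONS


# Plain left join: every row with key 7 is routed to the same partition.
plain = [(k, v, right.get(k)) for k, v in left]
plain_sizes = Counter(partition_of(k) for k, _ in left)

# Salted left join:
#   1. tag each left row with a salt (round-robin here, to keep the sketch
#      deterministic; a random salt is the more usual choice),
#   2. replicate each right row once per salt value,
#   3. join on (key, salt) instead of key alone.
salted_left = [(k, i % NUM_SALTS, v) for i, (k, v) in enumerate(left)]
salted_right = {(k, s): v for k, v in right.items() for s in range(NUM_SALTS)}
salted = [(k, v, salted_right.get((k, s))) for k, s, v in salted_left]
salted_sizes = Counter(partition_of(k, s) for k, s, _ in salted_left)

# Largest partition before and after salting.
print(max(plain_sizes.values()), max(salted_sizes.values()))
```

In Spark the same idea would usually be a salt column added to the skewed side, with the other side exploded over all salt values before the join. The cost is that the replicated side grows by the salt factor. On Spark 3.0+ it is probably worth trying spark.sql.adaptive.enabled=true together with spark.sql.adaptive.skewJoin.enabled=true first, since that lets adaptive execution split skewed partitions of a sort-merge join without any manual salting.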