Hi everyone,

I am running a Spark application that performs two left joins: the first is a broadcast join and the second is a regular (non-broadcast) join. Out of 200 tasks, the last one is stuck. It is running at locality level "ANY". This looks like a data skew issue: the task spills heavily and its shuffle write is very large. The following message appears in the executor logs:
INFO UnsafeExternalSorter: Thread spilling sort data of 10.4 GB to disk (10 times so far)

Can anyone please suggest what might be wrong?

Thanks,
Rajat