Re: Thread spilling sort issue with single task

2021-01-26 Thread German Schiavon
Well, if your data is skewed, I don't think it can be avoided, but it can be mitigated using skew techniques. I'd recommend taking a look at the "salted join", maybe.

On Tue, 26 Jan 2021 at 11:29, rajat kumar wrote:
> Hi ,
>
> Yes I understand its skew based problem but how can it be avoided . Could
> you
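
For readers unfamiliar with the "salted join" mentioned above, here is a minimal sketch of the idea in plain Python rather than Spark. The skewed (large) side gets a random salt appended to each key, and the small side is replicated once per salt value, so rows for a hot key are spread over several distinct join keys instead of one. The names NUM_SALTS, salt_large_side, explode_small_side and hash_join are hypothetical helpers made up for this illustration, and the sample data is invented. In Spark the same pattern is usually written by adding a salt column to the large DataFrame and joining on (key, salt), but the code below does not use any Spark API.

```python
import random
from collections import Counter, defaultdict

NUM_SALTS = 8  # hypothetical number of salt buckets; tune to the observed skew


def salt_large_side(rows, num_salts, rng):
    """Append a random salt to the key of every row on the large, skewed side."""
    return [((key, rng.randrange(num_salts)), value) for key, value in rows]


def explode_small_side(rows, num_salts):
    """Replicate every row of the small side once per possible salt value."""
    return [((key, salt), value) for key, value in rows for salt in range(num_salts)]


def hash_join(left, right):
    """Plain in-memory inner equi-join on whatever key the rows carry."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, ())]


# Invented sample data: one "hot" key dominates the large side.
large = [("hot", i) for i in range(1000)] + [("cold", i) for i in range(10)]
small = [("hot", "H"), ("cold", "C")]

plain_result = hash_join(large, small)

rng = random.Random(0)
salted_large = salt_large_side(large, NUM_SALTS, rng)
salted_small = explode_small_side(small, NUM_SALTS)
# Drop the salt again so the output has the same shape as the unsalted join.
salted_result = [(key, lv, rv) for (key, _salt), lv, rv in hash_join(salted_large, salted_small)]

# Largest number of rows sharing one join key, before and after salting.
max_rows_per_key_plain = max(Counter(key for key, _ in large).values())
max_rows_per_key_salted = max(Counter(key for key, _ in salted_large).values())
```

The join result is the same either way; what changes is that no single join key holds all the "hot" rows any more, which is what lets a partitioned engine spread the work over several tasks. The cost is that the small side grows by a factor of NUM_SALTS.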

Re: Thread spilling sort issue with single task

2021-01-26 Thread rajat kumar
Hi,

Yes, I understand it's a skew-based problem, but how can it be avoided? Could you please suggest an approach? I am on Spark 2.4.

Thanks,
Rajat

On Tue, Jan 26, 2021 at 3:58 PM German Schiavon wrote:
> Hi,
>
> One word : SKEW
>
> It seems the classic skew problem, you would have to apply skew techniques
> to

Re: Thread spilling sort issue with single task

2021-01-26 Thread German Schiavon
Hi,

One word: SKEW.

This looks like the classic skew problem. You would have to apply skew techniques to repartition your data properly or, if you are on Spark 3.0+, try the skewJoin optimization.

On Tue, 26 Jan 2021 at 11:20, rajat kumar wrote:
> Hi Everyone,
>
> I am running a spark application where
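
The skewJoin optimization mentioned above is part of adaptive query execution (AQE) in Spark 3.0+, and it is controlled through configuration properties. As a rough sketch, the snippet below collects the relevant keys in a Python dict and renders them as `--conf` arguments for spark-submit. The property names are Spark's own; the two threshold values are only illustrative and should be checked against the documentation for your Spark version. SKEW_JOIN_CONF and to_submit_args are hypothetical names used for this example.

```python
# Spark 3.0+ settings related to the AQE skew-join optimization.
SKEW_JOIN_CONF = {
    # AQE must be on for the skew-join handling to apply.
    "spark.sql.adaptive.enabled": "true",
    "spark.sql.adaptive.skewJoin.enabled": "true",
    # Illustrative thresholds (assumption): a partition counts as skewed when it is
    # both this many times larger than the median partition and above the byte threshold.
    "spark.sql.adaptive.skewJoin.skewedPartitionFactor": "5",
    "spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes": "256MB",
}


def to_submit_args(conf):
    """Render a settings dict as a flat list of spark-submit --conf arguments."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


submit_args = to_submit_args(SKEW_JOIN_CONF)
```

None of this is available on Spark 2.4, which is why manual techniques such as salting or repartitioning are the fallback there.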