There were recently some fantastic talks about this at the SparkSummit conference in San Francisco. I suggest you check out the SparkSummit YouTube channel after May 9th for a deep dive into this topic.
From: rajat kumar <kumar.rajat20...@gmail.com>
Date: Monday, April 29, 2019 at 9:34 AM
To: "user@spark.apache.org" <user@spark.apache.org>
Subject: [EXT] handling skewness issues

Hi All,

How can we overcome skewness issues in Spark? I read that we can add some randomness to the key column before a join and remove that random part after the join. Is there a better way? The above method seems to be a workaround.

Thanks,
Rajat
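
For readers following the thread, here is a rough sketch of the "add randomness to the key, then remove it" idea from the question, usually called salting. It is plain Python over lists of tuples, not Spark code, and is only meant to show the mechanics. The names salt_large_side, replicate_small_side, hash_join, salted_join and the NUM_SALTS value are all made up for illustration. The assumption is that one side of the join is skewed (one "hot" key dominates) and the other side is small enough to replicate once per salt value.

```python
import random
from collections import Counter, defaultdict

NUM_SALTS = 4  # hypothetical number of salt buckets; tune for the actual skew


def salt_large_side(rows, num_salts, rng):
    """Attach a random salt to each key on the skewed side."""
    return [((key, rng.randrange(num_salts)), value) for key, value in rows]


def replicate_small_side(rows, num_salts):
    """Copy each row once per salt so every salted key still finds its match."""
    return [((key, salt), value) for key, value in rows for salt in range(num_salts)]


def hash_join(left, right):
    """Plain inner equi-join on the first element of each tuple."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


def salted_join(large, small, num_salts=NUM_SALTS, seed=0):
    """Join on (key, salt), then drop the salt from the result."""
    rng = random.Random(seed)
    joined = hash_join(
        salt_large_side(large, num_salts, rng),
        replicate_small_side(small, num_salts),
    )
    return [(key, lv, rv) for (key, _salt), lv, rv in joined]


# Skewed input: 1000 rows share the key "hot", one row has the key "cold".
large = [("hot", i) for i in range(1000)] + [("cold", 1000)]
small = [("hot", "H"), ("cold", "C")]

result = salted_join(large, small)
plain = hash_join(large, small)

# How many rows land on each salted key on the large side.
salted_key_counts = Counter(
    key for key, _ in salt_large_side(large, NUM_SALTS, random.Random(0))
)
```

The salted join returns the same rows as the plain join, but the hot key is split across up to NUM_SALTS distinct join keys, so no single key carries all 1000 rows. The cost is that the small side grows by a factor of NUM_SALTS, which is why this tends to be applied only when one side is small or only to the known hot keys.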