Hi Team, I am trying to understand the salting technique for join columns where many rows share the same key, which puts a huge load on a single partition.
I watched a YouTube video and came away with the following understanding: with salting, we change the values of the join column by appending a random number from a specified range.

Suppose I have these values in one partition of each of two tables:

Table A:
Partition 1: x . . . x

Table B:
Partition 1: x . . . x

After salting, it would look something like this:

Table A:
Partition 1: x_1
Partition 2: x_2

Table B:
Partition 1: x_3
Partition 2: x_8

Now, when I inner join these two tables after salting (in order to avoid the data skew problem), I won't get a match, because the keys no longer agree once each side has its own random suffix.

So how does this resolve the data skew issue, or is there a gap in my understanding? Could anyone explain it in layman's terms?

TIA,
Sid
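
To make the example concrete, here is a minimal plain-Python sketch of how salted joins are commonly described. It is not taken from the video, and it assumes only one table gets a random suffix while the other table is copied once per possible suffix, so that every salted key still has a partner. The names `NUM_SALTS`, `salt_large_side`, `replicate_small_side`, and `inner_join` are hypothetical, and the lists stand in for distributed tables.

```python
import random
from collections import defaultdict

NUM_SALTS = 4  # hypothetical salt range: suffixes 0..3


def salt_large_side(rows, num_salts, rng):
    """Append a random salt in [0, num_salts) to each key of the skewed table."""
    return [(f"{key}_{rng.randrange(num_salts)}", value) for key, value in rows]


def replicate_small_side(rows, num_salts):
    """Copy each row once per salt value so every salted key can find a match."""
    return [(f"{key}_{s}", value) for key, value in rows for s in range(num_salts)]


def inner_join(left, right):
    """Simple hash join on the first tuple element."""
    index = defaultdict(list)
    for key, value in right:
        index[key].append(value)
    return [(key, lv, rv) for key, lv in left for rv in index.get(key, [])]


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Table A is skewed: every row has key "x". Table B is small.
table_a = [("x", f"a{i}") for i in range(8)]
table_b = [("x", "b0"), ("y", "b1")]

salted_a = salt_large_side(table_a, NUM_SALTS, rng)    # keys like x_0 .. x_3
salted_b = replicate_small_side(table_b, NUM_SALTS)    # x_0..x_3 and y_0..y_3

plain = inner_join(table_a, table_b)
joined = inner_join(salted_a, salted_b)

# Same matched row pairs as the unsalted join, but spread over several keys.
print(len(plain), len(joined))
print(sorted({key for key, _, _ in joined}))
```

Under these assumptions, the rows for `x` are split across up to `NUM_SALTS` distinct keys, which is what would let a partitioner place them on different partitions. The cost is that the replicated table grows by a factor of `NUM_SALTS`. If both tables were given independent random suffixes, as in my example above, the keys would indeed fail to match.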