Re: skewed data in join

Anis Nasir Thu, 16 Feb 2017 08:54:42 -0800

You can also do something similar to what is described in [1].

The basic idea is to apply two hash functions to each key and assign each
record to the less loaded of the two workers that the key hashes to.
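
A minimal Python sketch of that idea, for illustration only and not taken
from [1]. The names (candidates, assign, route) and the seeded-MD5 hashing
are assumptions made for this example; the load counter is a plain local
list rather than anything Spark provides.

```python
import hashlib


def _hash(key: str, seed: int, num_workers: int) -> int:
    """Deterministic seeded hash of a key onto a worker index."""
    digest = hashlib.md5(f"{seed}:{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def candidates(key: str, num_workers: int) -> tuple:
    """The two workers a key may be sent to (they can coincide)."""
    return _hash(key, 1, num_workers), _hash(key, 2, num_workers)


def assign(key: str, load: list, num_workers: int) -> int:
    """Send one record to the less loaded of its two candidate workers."""
    a, b = candidates(key, num_workers)
    chosen = a if load[a] <= load[b] else b
    load[chosen] += 1
    return chosen


def route(keys, num_workers: int):
    """Route a stream of keys; return the assignments and final loads."""
    load = [0] * num_workers
    assignments = [assign(k, load, num_workers) for k in keys]
    return assignments, load


if __name__ == "__main__":
    # One hot key plus a few rare keys, spread over 4 workers.
    assignments, load = route(["hot"] * 100 + ["a", "b", "c"], 4)
    print(load)
```

Because a hot key can now land on either of its two candidate workers, the
rows from the other side of the join with that key have to be available on
both of them.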


Cheers,
Anis


[1] https://melmeric.files.wordpress.com/2014/11/the-power-of-both-choices-practical-load-balancing-for-distributed-stream-processing-engines.pdf


On Fri, 17 Feb 2017 at 01:34, Yong Zhang <[email protected]> wrote:

> Yes. You have to change your key, or, in Big Data terms, "add salt".
>
>
> Yong
>
> ------------------------------
> *From:* Gourav Sengupta <[email protected]>
> *Sent:* Thursday, February 16, 2017 11:11 AM
> *To:* user
> *Subject:* skewed data in join
>
> Hi,
>
> Is there a way to do multiple reducers for joining on skewed data?
>
> Regards,
> Gourav
>
