The keyBy argument function is deterministic under the same MaxParallelism, which
makes sure the key group is always the same for the same key. My goal here is to
make keys distribute evenly among operators even with different parallelism. I
implement the mapping in a different way by exhausting
From: Yuval Itzchakov
Sent: Thursday, 4 November 2021 08:25
To: naitong Xiao
Cc: user
Subject: Re: Custom partitioning of keys with keyBy
Thank you Schwalbe, David and Naitong for your answers!
*David*: This is what we're currently doing ATM, and I wanted to know if
there's any simplified approach to this. This is what we have so far:
https://gist.github.com/YuvalItzchakov/9441a4a0e80609e534e69804e94cb57b
*Naitong*: The keyBy intern
I think I had a similar scenario several months ago; here is my related code:

import org.apache.flink.runtime.state.KeyGroupRangeAssignment

val MAX_PARALLELISM = 16
val KEY_RAND_SALT = "73b46"

logSource.keyBy { value =>
  // Re-key by the salted key group id, so there are at most
  // MAX_PARALLELISM distinct keys, one per key group.
  val keyGroup = KeyGroupRangeAssignment.assignToKeyGroup(value.deviceUdid, MAX_PARALLELISM)
  s"$KEY_RAND_SALT$keyGroup"
}
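To see why this caps the number of distinct keys, here is a minimal, self-contained sketch of the same salting idea. It is an approximation under stated assumptions: Flink's real KeyGroupRangeAssignment.assignToKeyGroup murmur-hashes key.hashCode() before taking the modulo, while this sketch uses a plain floorMod so it runs without Flink on the classpath; the SaltedKeyDemo class and the device-N ids are made up for illustration.

```java
import java.util.HashSet;
import java.util.Set;

public class SaltedKeyDemo {
    static final int MAX_PARALLELISM = 16;
    static final String KEY_RAND_SALT = "73b46";

    // Simplified stand-in for KeyGroupRangeAssignment.assignToKeyGroup:
    // real Flink murmur-hashes key.hashCode() first; a plain floorMod
    // is enough to show the mechanics.
    static int assignToKeyGroup(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Salted replacement key, as in the keyBy above: every id that lands
    // in the same key group gets the same new key, so at most
    // MAX_PARALLELISM distinct keys exist after re-keying.
    static String saltedKey(String deviceId) {
        return KEY_RAND_SALT + assignToKeyGroup(deviceId, MAX_PARALLELISM);
    }

    public static void main(String[] args) {
        Set<String> distinct = new HashSet<>();
        for (int i = 1; i <= 1000; i++) {
            distinct.add(saltedKey("device-" + i));
        }
        // 1000 raw ids collapse onto at most 16 salted keys.
        System.out.println(distinct.size());
    }
}
```

The salt only changes which key groups the 16 synthetic keys hash into; the mapping stays deterministic per key, which is what keyBy requires.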
The keyGroup is
> …where you actually want to run in parallelism higher than the number of
> different ‘small’ keys
>
> Hope this helps
>
> Thias
>
> *From:* Yuval Itzchakov
> *Sent:* Wednesday, 3 November 2021 14:41
> *To:* user
> *Subject:* Custom partitioning of keys with keyBy
Hi,
I have a use-case where I'd like to partition a KeyedDataStream a bit
differently than how Flink's default partitioning works with key groups.
What I'd like to be able to do is take all my data and split it up evenly