Re: Custom partitioning of keys with keyBy

2021-11-04 Thread naitong Xiao
The keyBy argument function is a deterministic function under the same maxParallelism, which makes sure the key group is always the same for the same key. My goal here is making keys distributed evenly among operators even with different parallelism. I implement the mapping in a different way, by exhausting …
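A minimal sketch of what such an exhaustive mapping could look like, assuming a small, known key set: candidate salts are probed until KeyGroupRangeAssignment.assignToKeyGroup places the salted key in the desired key group. The helper names (saltFor, buildMapping) and the probing scheme are illustrative assumptions, not code from the thread.

  import org.apache.flink.runtime.state.KeyGroupRangeAssignment

  object ExhaustiveKeyMapping {
    val MaxParallelism = 16

    // Probe candidate salts until the salted key is assigned to the target key group.
    // assignToKeyGroup is deterministic for a fixed max parallelism, so the mapping
    // can be precomputed once and reused.
    def saltFor(key: String, targetKeyGroup: Int): String =
      Iterator.from(0)
        .map(i => s"$key#$i")
        .find(salted =>
          KeyGroupRangeAssignment.assignToKeyGroup(salted, MaxParallelism) == targetKeyGroup)
        .get

    // Spread a small, known key set round-robin over all key groups.
    def buildMapping(keys: Seq[String]): Map[String, String] =
      keys.zipWithIndex.map { case (key, i) =>
        key -> saltFor(key, i % MaxParallelism)
      }.toMap
  }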

RE: Custom partitioning of keys with keyBy

2021-11-04 Thread Schwalbe Matthias
> Thank you Schwalbe, David and Naitong for your answers! David: This is what we're currently doing ATM, and I wanted to know if there's any simplified approach to …

Re: Custom partitioning of keys with keyBy

2021-11-04 Thread Yuval Itzchakov
Thank you Schwalbe, David and Naitong for your answers! *David*: This is what we're currently doing ATM, and I wanted to know if there's any simplified approach to this. This is what we have so far: https://gist.github.com/YuvalItzchakov/9441a4a0e80609e534e69804e94cb57b *Naitong*: The keyBy intern…

Re: Custom partitioning of keys with keyBy

2021-11-03 Thread naitong Xiao
I think I had a similar scenario several months ago, here is my related code:

  val MAX_PARALLELISM = 16
  val KEY_RAND_SALT = "73b46"

  logSource.keyBy { value =>
    val keyGroup = KeyGroupRangeAssignment.assignToKeyGroup(value.deviceUdid, MAX_PARALLELISM)
    s"$KEY_RAND_SALT$keyGroup"
  }

The keyGroup is …
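A small sketch (not from the thread) of how one might check where these salted keys actually land for a given runtime parallelism; the parallelism of 4 is an assumption, and computeOperatorIndexForKeyGroup is the KeyGroupRangeAssignment utility that maps a key group to its subtask index.

  import org.apache.flink.runtime.state.KeyGroupRangeAssignment

  object SaltedKeyDistribution extends App {
    val MaxParallelism = 16
    val Parallelism    = 4        // assumed runtime parallelism
    val Salt           = "73b46"

    // The re-keyed string s"$Salt$keyGroup" is hashed again by keyBy, so it is
    // worth checking which key group and subtask each salted key ends up on.
    (0 until MaxParallelism).foreach { kg =>
      val saltedKey   = s"$Salt$kg"
      val newKeyGroup = KeyGroupRangeAssignment.assignToKeyGroup(saltedKey, MaxParallelism)
      val subtask     = KeyGroupRangeAssignment.computeOperatorIndexForKeyGroup(
        MaxParallelism, Parallelism, newKeyGroup)
      println(s"salted key $saltedKey -> key group $newKeyGroup -> subtask $subtask")
    }
  }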

Re: Custom partitioning of keys with keyBy

2021-11-03 Thread David Anderson
> …where you actually want to run in parallelism higher than the number of different ‘small’ keys. > Hope this helps. > Thias …

RE: Custom partitioning of keys with keyBy

2021-11-03 Thread Schwalbe Matthias
> Hi, I have a use-case where I'd like to partition a KeyedDataStream a bit differently than how Flink's default partitioning works with key groups. What I'd like to be able to do is take all my data and split it up evenly …
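For contrast with the custom approaches above, a hedged sketch of what Flink's default partitioning does with a key: the key is hashed into one of maxParallelism key groups, and each parallel subtask owns a contiguous range of key groups. KeyGroupRangeAssignment exposes this as a public utility; the object name and sample keys below are illustrative only.

  import org.apache.flink.runtime.state.KeyGroupRangeAssignment

  object DefaultKeyByAssignment extends App {
    val MaxParallelism = 128   // Flink's default max parallelism
    val Parallelism    = 4     // illustrative runtime parallelism

    // assignKeyToParallelOperator = assignToKeyGroup + computeOperatorIndexForKeyGroup,
    // i.e. key -> key group -> contiguous key-group range -> subtask index.
    Seq("device-a", "device-b", "device-c").foreach { key =>
      val subtask = KeyGroupRangeAssignment.assignKeyToParallelOperator(key, MaxParallelism, Parallelism)
      println(s"key $key -> subtask $subtask")
    }
  }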