Re: Task Assignment

2020-04-27 Thread Piotr Nowojski
Hi Navneeth, But what’s the problem with using `keyBy(…)`? If you have a set of keys that you want to process together, in other words they are basically equal from the `keyBy(…)` perspective, why can’t you use this in your `KeySelector`? Maybe to make it clear, you can think about this in
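
To illustrate the suggestion above, here is a minimal sketch in plain Java (no Flink dependency) of a key-extraction function that makes several fine-grained keys "equal" for grouping purposes. The `Reading` type, its `deviceId` and `region` fields, and the `groupKey` helper are hypothetical names chosen for illustration; in a Flink job the same logic would presumably live inside the `KeySelector` passed to `keyBy(…)`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupingKeySketch {
    // Hypothetical record: each device has its own id but belongs to a region.
    static class Reading {
        final String deviceId;
        final String region;

        Reading(String deviceId, String region) {
            this.deviceId = deviceId;
            this.region = region;
        }
    }

    // Returns the coarse key that records should be grouped by.
    // Assumption: in Flink this would be the body of the KeySelector,
    // e.g. stream.keyBy(r -> r.region).
    static String groupKey(Reading r) {
        return r.region;
    }

    public static void main(String[] args) {
        List<Reading> input = new ArrayList<>();
        input.add(new Reading("device-1", "eu"));
        input.add(new Reading("device-2", "eu"));
        input.add(new Reading("device-3", "us"));

        // Group device ids by the coarse key, mimicking what keying does logically.
        Map<String, List<String>> grouped = new TreeMap<>();
        for (Reading r : input) {
            grouped.computeIfAbsent(groupKey(r), k -> new ArrayList<>()).add(r.deviceId);
        }
        System.out.println(grouped);
    }
}
```

Records with different `deviceId` values but the same `region` share one key here, so a keyed stream would route them to the same operator instance.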

Re: Task Assignment

2020-04-27 Thread Marta Paes Moreira
Sorry — I didn't understand you were dealing with multiple keys. In that case, I'd recommend you read about key-group assignment [1] and check the KeyGroupRangeAssignment class [2]. Key-groups are assigned to parallel tasks as ranges before the job is started — this is also a well-defined behavio
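
As a rough illustration of the key-group idea mentioned above, the following plain-Java sketch maps a key to a key-group and then maps that key-group to a parallel task index by contiguous ranges. It is a simplified model rather than Flink's code: the `mix` function is a stand-in for the hash Flink applies (assumed here to be a murmur hash), and the method names `keyGroupFor` and `operatorIndexFor` are hypothetical. The range formula `keyGroup * parallelism / maxParallelism` reflects my understanding of what `KeyGroupRangeAssignment` does; please check the class itself before relying on it.

```java
class KeyGroupSketch {
    // Stand-in bit mixer, NOT Flink's actual hash function (assumption).
    // The final mask keeps the result non-negative so the modulo below is safe.
    static int mix(int h) {
        h ^= (h >>> 16);
        h *= 0x85ebca6b;
        h ^= (h >>> 13);
        return h & Integer.MAX_VALUE;
    }

    // Key -> key-group in the range [0, maxParallelism).
    static int keyGroupFor(Object key, int maxParallelism) {
        return mix(key.hashCode()) % maxParallelism;
    }

    // Key-group -> index of the parallel task that owns it.
    // Each task ends up with one contiguous range of key-groups.
    static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128;
        int parallelism = 4;
        for (String key : new String[] {"alpha", "beta", "gamma"}) {
            int group = keyGroupFor(key, maxParallelism);
            int task = operatorIndexFor(group, maxParallelism, parallelism);
            System.out.println(key + " -> key-group " + group + " -> task " + task);
        }
    }
}
```

With `maxParallelism = 128` and `parallelism = 4` in this model, key-groups 0 to 31 belong to task 0, 32 to 63 to task 1, and so on, which is why the assignment is fixed before the job starts.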

Re: Task Assignment

2020-04-23 Thread Navneeth Krishnan
Hi Marta, Thanks for your response. What I'm looking for is something like data localization. If I have one TM which is processing a set of keys, I want to ensure all keys of the same type go to the same TM rather than using hashing to find the downstream slot. I could use a common key to do this

Re: Task Assignment

2020-04-23 Thread Marta Paes Moreira
Hi, Navneeth. If you *key* your stream using stream.keyBy(…), this will logically split your input and all the records with the same key will be processed in the same operator instance. This is the default behavior in Flink for keyed streams and transparently handled. You can read more about it i

Task Assignment

2020-04-22 Thread Navneeth Krishnan
Hi All, Is there a way for an upstream operator to know how the downstream operator tasks are assigned? Basically I want to group my messages to be processed on slots in the same node based on some key. Thanks