Thanks for the clarification. I was just trying to understand the intended 
behavior. It would have been nice if Flink tracked state for downstream 
operators by key, but I can do that with a map in the downstream functions. 
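
A minimal sketch of the per-key map approach, in plain Java with no Flink
dependency. Every name here (PerKeyStateSketch, ParallelInstance,
instanceFor) is hypothetical, and the hash-modulo routing is only a simplified
stand-in for how keyBy() assigns keys; the one property it is assumed to share
with Flink is that a given key always reaches the same parallel instance.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java illustration only; none of these names are Flink APIs.
class PerKeyStateSketch {

    // Stand-in for one parallel instance of a downstream function.
    static class ParallelInstance {
        // One counter per key, because a single instance may see many keys.
        final Map<String, Long> countsByKey = new HashMap<>();

        void process(String key) {
            // Create the counter on first sight of the key, otherwise increment it.
            countsByKey.merge(key, 1L, Long::sum);
        }
    }

    // Simplified routing (an assumption, not Flink's actual scheme):
    // the same key always maps to the same instance index.
    static int instanceFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Routes every record to its instance and returns the instances with their state.
    static ParallelInstance[] run(String[] keys, int parallelism) {
        ParallelInstance[] instances = new ParallelInstance[parallelism];
        for (int i = 0; i < parallelism; i++) {
            instances[i] = new ParallelInstance();
        }
        for (String key : keys) {
            instances[instanceFor(key, parallelism)].process(key);
        }
        return instances;
    }

    public static void main(String[] args) {
        String[] keys = {"a", "b", "c", "a", "c", "a"};
        ParallelInstance[] instances = run(keys, 2);
        for (int i = 0; i < instances.length; i++) {
            System.out.println("instance " + i + ": " + instances[i].countsByKey);
        }
    }
}
```

With a parallelism of 2, the keys "a" and "c" end up on the same instance in
this sketch, so that instance has to keep their counts apart in its map, while
"b" goes to the other instance.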

Michael

> On Apr 5, 2018, at 2:30 AM, Fabian Hueske <fhue...@gmail.com> wrote:
> 
> Amit is correct. keyBy() ensures that all records with the same key are 
> processed by the same parallel instance of a function.
> This is different from "a parallel instance only sees records of one key".
> 
> I had a look at the docs [1]. 
> I agree that "Logically partitions a stream into disjoint partitions, each 
> partition containing elements of the same key." can be easily interpreted as 
> you did.
> I've pushed a commit to clarify the description. The docs should be updated 
> soon.
> 
> Best, Fabian 
> 
> [1] 
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/#datastream-transformations
> 
> 2018-04-05 6:21 GMT+02:00 Amit Jain <aj201...@gmail.com>:
>> Hi,
>> 
>> The keyBy operation partitions the data on the given key and makes sure the
>> same slot gets all future data belonging to the same key. In the default
>> implementation, it can also map a subset of the keys in your DataStream to
>> the same slot.
>> 
>> Assuming the number of keys equals the number of running slots, you may
>> specify a custom keyBy operation to achieve the same.
>> 
>> Could you describe your use case?
>> 
>> --
>> Thanks
>> Amit
>> 