Thanks for the clarification. I was just trying to understand the intended behavior. It would have been nice if Flink tracked state for downstream operators by key, but I can do that with a map in the downstream functions.
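For reference, here is a minimal plain-Java sketch of the map-based approach, with no Flink dependency. The names (KeyBySketch, CountingInstance, instanceFor, run) are made up for illustration, and the hash-modulo routing is only an assumed stand-in for Flink's actual key-to-instance assignment. It shows the two points from this thread: every record with a given key reaches the same parallel instance, and one instance may still see several keys, so it keeps its per-key state in a map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java simulation of keyed routing; all names are hypothetical. */
class KeyBySketch {

    /** Stand-in for one parallel instance of a downstream function. */
    static class CountingInstance {
        // Per-key state held by this instance: key -> number of records seen.
        final Map<String, Long> countsByKey = new HashMap<>();

        void process(String key) {
            countsByKey.merge(key, 1L, Long::sum);
        }
    }

    /**
     * Simplified routing: the same key always maps to the same instance index.
     * Assumption: hashCode modulo parallelism, not Flink's real assignment.
     */
    static int instanceFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    /** Routes every key to its instance and returns the instances with their state. */
    static List<CountingInstance> run(List<String> keys, int parallelism) {
        List<CountingInstance> instances = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            instances.add(new CountingInstance());
        }
        for (String key : keys) {
            instances.get(instanceFor(key, parallelism)).process(key);
        }
        return instances;
    }

    public static void main(String[] args) {
        List<CountingInstance> instances =
                run(Arrays.asList("a", "b", "c", "a", "b", "a"), 2);
        for (int i = 0; i < instances.size(); i++) {
            System.out.println("instance " + i + " -> " + instances.get(i).countsByKey);
        }
    }
}
```

With two instances in this sketch, the keys "a" and "c" land on the same instance while "b" goes to the other, which is why the counts have to be kept per key rather than per instance.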
Michael

Sent from my iPad

> On Apr 5, 2018, at 2:30 AM, Fabian Hueske <fhue...@gmail.com> wrote:
>
> Amit is correct. keyBy() ensures that all records with the same key are
> processed by the same parallel instance of a function.
> This is different from "a parallel instance only sees records of one key".
>
> I had a look at the docs [1].
> I agree that "Logically partitions a stream into disjoint partitions, each
> partition containing elements of the same key." can easily be interpreted as
> you did.
> I've pushed a commit to clarify the description. The docs should be updated
> soon.
>
> Best, Fabian
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/#datastream-transformations
>
> 2018-04-05 6:21 GMT+02:00 Amit Jain <aj201...@gmail.com>:
>> Hi,
>>
>> The keyBy operation partitions the data on the given key and makes sure the
>> same slot gets all future data belonging to the same key. In the default
>> implementation, it can also map a subset of the keys in your DataStream to
>> the same slot.
>>
>> Assuming the number of keys equals the number of running slots, you may
>> specify a custom keyBy operation to achieve the same.
>>
>> Could you describe your case?
>>
>> --
>> Thanks
>> Amit
>>
>> --
>> Sent from:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/