Hi Till, thanks for the clarification. It all makes sense now.
So the keyBy call is more a partitioning scheme and less of an operator, similar to Storm's field grouping, and Flink's other schemes such as forward and broadcast. The difference is that it produces KeyedStreams, which are a prerequisite for certain types of transformations. Regards Leon 8. Jun 2016 14:05 by trohrm...@apache.org: > Hi Leon, > yes, you're right. The plan visualization shows the actual tasks. Each task > can contain one or more (if chained) operators. A task is split into > sub-tasks which are executed on the TaskManagers. A TaskManager slot can > accommodate one subtask of each task (if the task has not been assigned to > a different slot sharing group). Thus (per default) the number of required > slots is defined by the operator with the highest parallelism. > If you click on the different tasks, then you see in the list view at the > bottom of the page, where the individual sub-tasks have been deployed to. > The keyBy API call is actually not realized as a distinct Flink operator at > runtime. Instead, the keyBy transformation influences how a downstream > operator is connected with the upstream operator (pointwise or all to all). > Furthermore, the keyBy transformation sets a special StreamPartitioner > (HashPartitioner) which is used by the StreamRecordWriters to select the > output channels (receiving sub tasks) for the current stream record. That > is the reason why you don't see the keyBy transformation in the plan > visualization. Consequently, the illustration on our website is not totally > consistent with the actual implementation. > > Cheers,> Till > On Wed, Jun 8, 2016 at 11:01 AM, <> leon_mcl...@tutanota.com> > wrote: > >> >> I am using the dashboard to inspect my multi stage pipeline. >> I cannot seem to find a manual or other description for the dashboard >> aside from the quickstart section (>> >> https://ci.apache.org/projects/flink/flink-docs-release-1.0/quickstart/setup_quickstart.html>> >> >> ). >> >> I would like to know how approximately my physical layout (in terms of >> task slots, tasks and task chaining) looks like. I am assuming that each >> block in the attached topology plan is an operation that will execute as >> multiple parallel tasks based on the assigned parallelism. Multilevel >> operator chaining has been applied in most cases. The plan then occupies 3 >> task slots, as dictated by the highest DOP. Is this correct? >> >> What i am also unsure about is that i have keyBy transformations in my >> code, yet they are not shown as transformations in the plan. The >> corresponding section on workers and slots (>> >> https://ci.apache.org/projects/flink/flink-docs-release-1.0/concepts/concepts.html#workers-slots-resources>> >> >> ) however explicitly shows keyBy in the layout. >> >> Regards >> Leon >> > >