I'm sorry for the delay. I've added Till, who knows the scheduler details, to the conversation.

On Tue, Oct 18, 2016 at 3:09 PM, Jürgen Thomann <[email protected]> wrote:

> Hi Robert,
>
> Have you already had a chance to look at it? If you need more information,
> just let me know.
>
> Regards,
> Jürgen
>
>
> On 12.10.2016 21:12, Jürgen Thomann wrote:
>
>> Hi Robert,
>>
>> Thanks for your suggestions. We are using the DataStream API, and I tried
>> disabling operator chaining completely, but that didn't help.
>>
>> I attached the plan. To add some context: it starts with a Kafka source
>> followed by a map operation (parallelism 4). The next map is the
>> expensive part, with a parallelism of 18; it produces a Tuple2 which is
>> used for splitting. From here on the parallelism is always 2, except for
>> the sink, which has 1. Both resulting streams have two maps, a filter,
>> one more map, and end with an assignTimestampsAndWatermarks. If there is
>> a small box in the picture at that point, it is a filter operation;
>> otherwise the stream goes directly to a keyBy, timeWindow and apply
>> operation, followed by a sink.
>>
>> If one task manager contains more subtasks of the expensive map than any
>> other task manager, everything later in the stream runs on that same
>> task manager. If two task managers have the same number of subtasks, the
>> following tasks with a parallelism of 2 are distributed over those two
>> task managers.
>>
>> It is also interesting that the task managers have 6 task slots
>> configured and the expensive part has 6 subtasks on one task manager,
>> but still everything later in the flow runs on this task manager. This
>> also happens if operator chaining is disabled.
>>
>> Best,
>> Jürgen
>>
>>
>> On 12.10.2016 17:43, Robert Metzger wrote:
>>
>>> Hi Jürgen,
>>>
>>> Are you using the DataStream or the DataSet API?
>>> Maybe the operator chaining is causing too many operations to be
>>> "packed" into one task. Check out this documentation page:
>>> https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_api.html#task-chaining-and-resource-groups
>>> You could try to disable chaining completely to see if that resolves the
>>> issue (you'll probably pay for this by having more serialization overhead
>>> and network traffic).
>>>
>>> If my suggestions don't help, can you post a screenshot of your job plan
>>> (from the web interface) here, so that we can see what operations you
>>> are performing?
>>>
>>> Regards,
>>> Robert
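
A note on the slot numbers quoted above: with Flink's default slot sharing, a single slot can hold one subtask of each operator of the same job, so a job needs as many slots as its highest parallelism, not the sum of all parallelisms. That is why a task manager whose 6 slots already hold 6 subtasks of the expensive map can still accept the later operators. The sketch below only does this arithmetic with the parallelisms from the thread. It assumes every operator is in the default slot sharing group (the thread does not say otherwise), the function names are made up for illustration, and it does not model how the scheduler picks a task manager.

```python
import math


def required_slots(parallelisms):
    """Slots needed when all operators share one slot sharing group.

    Under slot sharing, one slot can host one subtask of every operator,
    so the highest parallelism in the job determines the slot count.
    """
    return max(parallelisms)


def min_task_managers(parallelisms, slots_per_tm):
    """Smallest number of task managers that can provide the required slots."""
    return math.ceil(required_slots(parallelisms) / slots_per_tm)


# Parallelisms as stated in the thread. The Kafka source's parallelism is
# not given there, so it is left out (assumption: it is not above 18).
job = {"map": 4, "expensive_map": 18, "downstream": 2, "sink": 1}

slots = required_slots(job.values())
tms = min_task_managers(job.values(), slots_per_tm=6)
print(slots, tms)  # prints "18 3"
```

So the job fits into 18 slots, which three task managers with 6 slots each would already provide. The uneven distribution described above suggests the cluster has more slots than that, but which task manager ends up with the downstream subtasks is a scheduler decision that this arithmetic does not cover.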
