Re: Distributing Tasks over Task manager

2016-10-26 Thread Till Rohrmann
Hi Jürgen, In a nutshell, Flink's scheduling works the following way: The sources are deployed with respect to local preferences. If there are no local preferences, then the first machine returned by the iterator of the map that stores the machines is used. So in general, the sources will first fill up the first availab
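The snippet above is cut off, but the behaviour it describes (no local preference, so take the first machine the map's iterator returns) can be illustrated with a toy model. The sketch below is not Flink's scheduler code: the class `FirstFitSketch`, the method `assign`, and the machine names `tm-1` to `tm-3` are hypothetical, and it assumes that a machine is skipped only once it has no free slot left.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT Flink's scheduler code.
class FirstFitSketch {

    // Assign each task to the first machine, in the map's iteration order,
    // that still has a free slot. Returns the chosen machine for each task.
    static List<String> assign(Map<String, Integer> freeSlots, int tasks) {
        // Work on a copy so the caller's map is left untouched.
        Map<String, Integer> remaining = new LinkedHashMap<>(freeSlots);
        List<String> placement = new ArrayList<>();
        for (int t = 0; t < tasks; t++) {
            String chosen = null;
            for (Map.Entry<String, Integer> e : remaining.entrySet()) {
                if (e.getValue() > 0) {
                    chosen = e.getKey();
                    break;
                }
            }
            if (chosen == null) {
                throw new IllegalStateException("no free slot left");
            }
            remaining.merge(chosen, -1, Integer::sum); // consume one slot
            placement.add(chosen);
        }
        return placement;
    }

    public static void main(String[] args) {
        Map<String, Integer> slots = new LinkedHashMap<>();
        slots.put("tm-1", 4);
        slots.put("tm-2", 4);
        slots.put("tm-3", 4);
        // Five tasks: the first machine fills up before the second is used.
        System.out.println(assign(slots, 5)); // [tm-1, tm-1, tm-1, tm-1, tm-2]
    }
}
```

In this model, adding more machines does not spread the load by itself, because later machines are only reached after the earlier ones are full. That matches the symptom reported at the start of the thread.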

Re: Distributing Tasks over Task manager

2016-10-26 Thread Robert Metzger
I'm sorry for the delay. I've added Till, who knows the scheduler details, to the conversation. On Tue, Oct 18, 2016 at 3:09 PM, Jürgen Thomann < juergen.thom...@innogames.com> wrote: > Hi Robert, > > Did you already have a chance to look at it? If you need more information > just let me know. > > Re

Re: Distributing Tasks over Task manager

2016-10-18 Thread Jürgen Thomann
Hi Robert, Did you already have a chance to look at it? If you need more information just let me know. Regards, Jürgen On 12.10.2016 21:12, Jürgen Thomann wrote: Hi Robert, Thanks for your suggestions. We are using the DataStream API and I tried disabling operator chaining completely, but that did

Re: Distributing Tasks over Task manager

2016-10-12 Thread Jürgen Thomann
Hi Robert, Thanks for your suggestions. We are using the DataStream API and I tried disabling operator chaining completely, but that didn't help. I attached the plan. To add some context, it starts with a Kafka source followed by a map operation (parallelism 4). The next map is the expensive pa

Re: Distributing Tasks over Task manager

2016-10-12 Thread Robert Metzger
Hi Jürgen, Are you using the DataStream or the DataSet API? Maybe the operator chaining is causing too many operations to be "packed" into one task. Check out this documentation page: https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_api.html#task-chaining-and-resource-groups

Distributing Tasks over Task manager

2016-10-12 Thread Jürgen Thomann
Hi, we currently have an issue with Flink: it allocates many tasks to the same task manager and overloads it as a result. I reduced the number of task slots per task manager (keeping the CPU count) and added some more servers, but that did not help to distribute the load. Is there some