Hi Jürgen,
In a nutshell, Flink's scheduling works the following way: The sources are
deployed with respect to local preferences. If there are no local preferences,
the first machine returned by the iterator of the map that stores the machines
is used.
So in general, the sources will first fill up the first availab
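
The rule described above can be illustrated with a small toy model. This is only a sketch and not Flink's scheduler code: the function name assign_first_fit, the machine names tm-1 to tm-3 and the slot counts are hypothetical, and it assumes that a machine is skipped once it has no free slot left. With no local preferences, every task goes to the first machine in the map's iteration order, so the first machine fills up before any other one is used.

```python
from collections import OrderedDict


def assign_first_fit(free_slots, num_tasks):
    """Toy model: place each task on the first machine, in map iteration
    order, that still has a free slot (no local preferences)."""
    remaining = OrderedDict(free_slots)  # copy, so the caller's map is untouched
    placement = []
    for _ in range(num_tasks):
        for machine, slots in remaining.items():
            if slots > 0:
                remaining[machine] = slots - 1  # occupy one slot
                placement.append(machine)
                break
        else:
            # Every machine was full
            raise RuntimeError("no free slot left")
    return placement


# Hypothetical cluster: three task managers with four slots each
machines = OrderedDict([("tm-1", 4), ("tm-2", 4), ("tm-3", 4)])
placement = assign_first_fit(machines, 4)
print(placement)
```

In this model a job with four source tasks lands entirely on tm-1 while tm-2 and tm-3 stay idle, which matches the kind of uneven load reported in this thread.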
I'm sorry for the delay. I've added Till, who knows the scheduler details, to
the conversation.
On Tue, Oct 18, 2016 at 3:09 PM, Jürgen Thomann <
juergen.thom...@innogames.com> wrote:
> Hi Robert,
>
> Have you already had a chance to look at it? If you need more information
> just let me know.
>
> Re
Hi Robert,
Have you already had a chance to look at it? If you need more information
just let me know.
Regards,
Jürgen
On 12.10.2016 21:12, Jürgen Thomann wrote:
Hi Robert,
Thanks for your suggestions. We are using the DataStream API, and I
tried disabling operator chaining completely, but that did
Hi Robert,
Thanks for your suggestions. We are using the DataStream API, and I tried
disabling operator chaining completely, but that didn't help.
I attached the plan. To add some context, it starts with a Kafka
source followed by a map operation (parallelism 4). The next map is the
expensive pa
Hi Jürgen,
Are you using the DataStream or the DataSet API?
Maybe operator chaining is causing too many operations to be "packed"
into one task. Check out this documentation page:
https://ci.apache.org/projects/flink/flink-docs-master/dev/datastream_api.html#task-chaining-and-resource-groups
Hi,
We currently have an issue with Flink: it allocates many tasks to the
same task manager and overloads it as a result. I reduced the number
of task slots per task manager (keeping the CPU count) and added some
more servers, but that did not help to distribute the load.
Is there some