Hi Jan,

As far as I remember, Flink doesn't handle cases like (1-2-1-1-1) with two
Task Managers very well. There are no guarantees about how the
operators/subtasks will be scheduled, but most likely it will be as you
mentioned/observed: the first Task Manager will run all of the operators,
while the second Task Manager will only run a single instance of the
second operator (for load balancing it would be better to spread the
tasks across the two Task Managers more evenly).
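
To make that placement concrete, here is a small Python sketch (a toy
model, not Flink code). It assumes slot sharing with one slot per Task
Manager and that subtask i of every operator ends up in slot i. The
function name place_subtasks is made up for illustration, and the real
scheduler gives no such guarantee.

```python
from collections import defaultdict

def place_subtasks(parallelisms):
    """Toy model: subtask i of every operator goes to slot i (one slot per Task Manager).

    This mirrors the likely outcome described above; it is an assumption,
    not a guarantee of Flink's scheduler.
    """
    slots = defaultdict(list)
    for op_index, parallelism in enumerate(parallelisms, start=1):
        for subtask in range(parallelism):
            slots[subtask].append(f"op{op_index}[{subtask}]")
    return dict(slots)

# Compare the two configurations from the question.
for config in ([1, 2, 1, 1, 1], [2, 2, 2, 2, 2]):
    layout = place_subtasks(config)
    for tm, subtasks in sorted(layout.items()):
        print(config, f"TM{tm + 1}:", len(subtasks), "subtasks", subtasks)
```

Under these assumptions, (1-2-1-1-1) puts five subtasks on the first Task
Manager and one on the second, while (2-2-2-2-2) puts five on each.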

No, Flink doesn't reserve any resources (apart from network buffers) per task.
All of the available memory and CPU resources are shared across all of the
running tasks. So in the (1-2-1-1-1) case, if the first Task Manager is
overloaded (for example, if it has very few CPU cores), the instance of the
second task on the second Task Manager (which is otherwise empty) will
perform much better, causing a throughput skew. From this perspective,
(2-2-2-2-2) would most likely perform better, as the load would be more
evenly spread.
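
As a rough back-of-the-envelope illustration of the skew, here is a Python
sketch. It assumes every subtask is single-threaded, CPU-bound and equally
busy, and that each Task Manager has 2 cores; the cpu_share helper and the
core count are hypothetical and chosen only for illustration. Real
workloads will not divide this cleanly.

```python
def cpu_share(subtasks_on_tm, cores):
    """Cores per subtask, assuming equally busy single-threaded subtasks (capped at one core)."""
    return min(1.0, cores / subtasks_on_tm)

CORES_PER_TM = 2  # assumed value, purely for illustration

# Number of subtasks on each Task Manager for the two configurations,
# following the placement described above.
layouts = {
    "1-2-1-1-1": {"TM1": 5, "TM2": 1},
    "2-2-2-2-2": {"TM1": 5, "TM2": 5},
}

for name, layout in layouts.items():
    shares = {tm: cpu_share(n, CORES_PER_TM) for tm, n in layout.items()}
    print(name, shares)
```

In this simplified model the two instances of the second task get very
different CPU shares in (1-2-1-1-1) but identical shares in (2-2-2-2-2),
which is the throughput skew mentioned above.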

Piotrek

On Sun, 28 Feb 2021 at 13:10, Jan Nitschke <nitschke...@web.de> wrote:

> Hello,
>
> We are working on a project where we want to gather information about the
> job performance across different task level parallelism settings.
> Essentially, we want to see how the throughput of a single task varies
> across different parallelism settings, e.g. for a job of 5 tasks: 1-1-1-1-1
> vs. 1-2-1-1-1 vs. 2-2-2-2-2.
>
> *We are running Flink on Kubernetes, a job with 5 tasks, slot sharing is
> enabled, operator chaining is disabled and each task manager has one slot.*
>
> So, the number of task managers is always the number of the highest
> parallelism and we can fit the entire job into one task manager slot.
>
> We are then running the job against multiple parallelism configs (such as
> those above), collect the relevant metrics and try to get some useful
> information out of them.
>
> We are now wondering how independent our results are from one another.
> More specifically, if we now look at the parallelism of the second task, is
> its performance independent of the parallelism of the other tasks? So, will
> the second task perform the same in (1-2-1-1-1) as in (2-2-2-2-2)?
>
> Our take on it is the following: With our setup, (1-2-1-1-1) should result
> in one task manager holding the entire job and a second task manager that
> only runs the second task. (2-2-2-2-2) will run two task managers with the
> entire job. So, theoretically, the second task should have much more
> resources available in the first setup as it has the entire resources of
> that task manager at its disposal. Does that assumption hold or will Flink
> assign a certain amount of resources to a task in a task manager no matter
> how many other tasks are running on that same task manager slot?
>
> We would highly appreciate any help.
>
> Best,
> Jan
>
