Hi Flink Community,

I'm currently running a heavy flink job on Flink 1.9.3 that has a lot of
subtasks and observing some subtask distribution issues. The job in
question has 9288 sub tasks and they are running on a large set of TMs
(total available slots are 1792).

I'm using the *cluster.evenly-spread-out-slots* configuration option to
have the slots be allocated uniformly but I am still seeing non-uniform
subtask distribution that seems to be affecting performance. It looks like
some TMs are being overloaded and seeing a much greater than uniform
allocation of subtasks. I've been trying to reproduce this situation at a
smaller scale but have not been successful in doing so.

As part of debugging the scheduling process when trying to reproduce this
at a smaller scale I observed that the non-location preference
selectWithoutLocationPreference override introduced by the evenly spread
out strategy option is not being invoked at all as the execution vertices
still have a location preference to be assigned the same slots as their
input vertices.

This happens at job startup time and not during recovery, so I'm not sure
if recovery is where the non preference code path is invoked. In essence
the only impact of using the evenly spread out strategy seems to be a
slightly different candidate score calculation.

I wanted to know:-
1. Is the evenly spread out strategy the right option to choose for
achieving the uniform distribution of subtasks?
2. Is the observed scheduling behaviour expected for the evenly spread out
strategy? When do we expect the non location preference code path to be
invoked? For us this only happens on sources since they have no incoming
edges.

Apart from this I am still trying to understand the nature of scheduling in
Flink and how that could bring about this situation, I was wondering if
there were known issues or peculiarities of the Flink job scheduler that
could lead to this situation occurring. For example I'm looking at the
known issues mentioned in the ticket
https://issues.apache.org/jira/browse/FLINK-11815
.

I was hoping to understand :-
1. The conditions that would give rise to these kinds of situations or how
to verify if we are running into them. For example, how to verify that key
group allocation is non-uniform
2. If these issues have been addressed in subsequent versions of flink
3. If there is any other information about the nature of scheduling jobs in
flink that could give rise to the non-uniform distribution observed.

Please let me know if further information needs to be provided.

Thanks,
Harshit

Reply via email to