mxm commented on PR #787:
URL: https://github.com/apache/flink-kubernetes-operator/pull/787#issuecomment-1978939188

   >In my use case, the job graph comprises only 6 operators and allocates 6 
task slots per task manager. Prior to implementing this improvement, setting 
the maximum parallelism to 18 resulted in frequent rescaling of my Flink job to 
various levels of parallelism for all vertices, such as 7, 8, 13, 15, and 16. 
However, with this enhancement, the Flink job rescales the biggest vertex only 
to parallelism levels of 6, 12, and 18. While it's true that other vertices may 
still experience rescaling to parallelism levels like 7, 13, or 15, the overall 
frequency of rescaling triggered by the Flink autoscaler has significantly 
decreased.
   
   I agree that this improvement is beneficial, especially for lower-parallelism 
jobs. I wonder whether it would make sense to align the parallelism with the 
number of task slots, i.e. have the parallelism always be a multiple of the 
number of task slots. This could result in more stable metrics because subtasks 
are equally distributed across the TaskManagers, which should stabilize the 
metrics for each associated job vertex (task).
   
   For example, if the number of task slots is 6, like in your example, the 
minimum parallelism would be 6. The next parallelism values would be 12, 18, 
24, and so on. That's essentially your idea, but generalized across all vertices.
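
   To make the rounding concrete, here is a minimal sketch of the idea, not the operator's actual code. The function name `align_to_task_slots` and its parameters are hypothetical, and the fallback to the largest multiple that fits below the maximum parallelism is an assumption of this sketch.

```python
def align_to_task_slots(parallelism: int, task_slots: int, max_parallelism: int) -> int:
    """Round a proposed parallelism up to the next multiple of task_slots.

    Hypothetical helper for illustration only; it is not part of the
    Flink autoscaler API.
    """
    if parallelism <= 0 or task_slots <= 0 or max_parallelism <= 0:
        raise ValueError("all arguments must be positive")

    # Ceiling division using integers only, then scale back up.
    aligned = -(-parallelism // task_slots) * task_slots

    if aligned > max_parallelism:
        # Assumption: fall back to the largest multiple that still fits.
        aligned = (max_parallelism // task_slots) * task_slots
        if aligned == 0:
            # No multiple fits below the cap, so use the cap itself.
            aligned = max_parallelism
    return aligned


if __name__ == "__main__":
    # With 6 task slots and a maximum parallelism of 18, the values from
    # the quoted example collapse onto 6, 12, and 18.
    for proposed in (7, 8, 13, 15, 16):
        print(proposed, "->", align_to_task_slots(proposed, 6, 18))
```

   With this rounding, every vertex would only ever land on 6, 12, or 18 in the quoted setup, at the cost of sometimes requesting more subtasks than the scaling decision asked for.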
   
   The only drawback is that, again, this needs to work with the key group 
alignment that we perform. Long term, it would probably be smarter to adjust 
the number of task slots such that it divides the number of key groups without 
a remainder. We can start by aligning to multiples of the configured number 
of task slots whenever we do not perform the key group adjustments 
(e.g. no shuffle).
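
   The interaction between the two constraints can be sketched as follows. This is an illustration only: it assumes that key group alignment means preferring parallelism values that divide the number of key groups evenly, and both function names are hypothetical.

```python
def slot_counts_dividing_key_groups(num_key_groups: int, max_slots: int) -> list[int]:
    """Task slot counts (up to max_slots) that divide num_key_groups evenly.

    Hypothetical helper; illustrates the "long term" idea of picking a
    slot count that is a divisor of the number of key groups.
    """
    return [s for s in range(1, max_slots + 1) if num_key_groups % s == 0]


def parallelisms_satisfying_both(num_key_groups: int, task_slots: int,
                                 max_parallelism: int) -> list[int]:
    """Parallelism values that are multiples of task_slots AND divide
    num_key_groups without a remainder (assumed meaning of alignment)."""
    return [
        p
        for p in range(task_slots, max_parallelism + 1, task_slots)
        if num_key_groups % p == 0
    ]


if __name__ == "__main__":
    # 6 slots and 128 key groups: no parallelism satisfies both constraints.
    print(parallelisms_satisfying_both(128, 6, 128))
    # Slot counts up to 8 that divide 128 evenly.
    print(slot_counts_dividing_key_groups(128, 8))
    # With 120 key groups, 6 slots leave several valid choices.
    print(parallelisms_satisfying_both(120, 6, 120))
```

   The empty result for 6 slots and 128 key groups shows why the two alignments can conflict, and why applying slot multiples only where no key group adjustment happens is the simpler starting point.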


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org