[
https://issues.apache.org/jira/browse/FLINK-34152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17825218#comment-17825218
]
Rui Fan commented on FLINK-34152:
---------------------------------
Merged to main (1.8.0) via:
* 48692b7243086a2ccf2cb4c64a6b00306fa1d65f
* 3ead906f33a4b3790fa5c12f2018e09db9443a09
* d526174d9ab4c3ccf92548c821d9f44acbd3f247
> Tune TaskManager memory
> -----------------------
>
> Key: FLINK-34152
> URL: https://issues.apache.org/jira/browse/FLINK-34152
> Project: Flink
> Issue Type: Sub-task
> Components: Autoscaler, Kubernetes Operator
> Reporter: Maximilian Michels
> Assignee: Maximilian Michels
> Priority: Major
> Labels: pull-request-available
> Fix For: kubernetes-operator-1.8.0
>
>
> The current autoscaling algorithm adjusts the parallelism of the job task
> vertices according to the processing needs. By adjusting the parallelism, we
> systematically scale the amount of CPU for a task. At the same time, we also
> indirectly change the amount of memory tasks have at their disposal. However,
> this indirect coupling causes problems (see the sketch after this list):
> # Memory is overprovisioned: on scale-up we may add more memory than we
> actually need. Even on scale-down, the memory / CPU ratio can still be off,
> leaving too much memory allocated.
> # Memory is underprovisioned: for stateful jobs, we risk running into
> OutOfMemoryErrors on scale-down. Even before running out of memory, too
> little memory can negatively impact the effectiveness of scaling.
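>
> A minimal sketch of this coupling, assuming a fixed TaskManager pod spec and
> slot count (the values below are illustrative, not operator defaults): the
> autoscaler changes the parallelism, which changes the number of TaskManagers,
> so total memory follows the parallelism decision rather than the job's actual
> memory needs.
>
> {code:java}
> // Illustrative only: total memory is a side effect of the parallelism decision.
> public final class MemoryScalingSketch {
>     public static void main(String[] args) {
>         long tmMemoryBytes = 4L << 30; // fixed pod spec: 4 GiB per TaskManager
>         int slotsPerTaskManager = 2;   // illustrative, not an operator default
>
>         for (int parallelism : new int[] {16, 4}) {
>             // TaskManagers needed to host the requested parallelism.
>             int numTaskManagers =
>                     (parallelism + slotsPerTaskManager - 1) / slotsPerTaskManager;
>             long totalGiB = numTaskManagers * tmMemoryBytes >> 30;
>             // parallelism 16 -> 8 TMs / 32 GiB; parallelism 4 -> 2 TMs / 8 GiB,
>             // even though a stateful job's state size may not have shrunk.
>             System.out.printf("parallelism=%d -> %d TMs, %d GiB total%n",
>                     parallelism, numTaskManagers, totalGiB);
>         }
>     }
> }
> {code}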
> We lack the capability to tune memory proportionally to the processing needs.
> In the same way that we measure CPU usage and size the tasks accordingly, we
> need to evaluate memory usage and adjust the heap memory size.
> https://docs.google.com/document/d/19GXHGL_FvN6WBgFvLeXpDABog2H_qqkw1_wrpamkFSc/edit
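>
> A minimal sketch of this direction, assuming observed heap usage is available
> from metrics; the class, method, and headroom factor below are hypothetical,
> not the operator's actual API:
>
> {code:java}
> /** Illustrative only: names and the headroom factor are hypothetical. */
> public final class HeapTuningSketch {
>     /**
>      * Derives a tuned heap size from observed usage, mirroring how CPU is
>      * sized from observed processing load.
>      */
>     static long tuneHeapSizeBytes(long observedMaxHeapUsedBytes,
>                                   long minHeapBytes,
>                                   long maxHeapBytes) {
>         double headroom = 1.25; // safety margin against OOM on scale-down
>         long target = (long) (observedMaxHeapUsedBytes * headroom);
>         // Clamp: neither starve the job nor exceed the pod's memory budget.
>         return Math.max(minHeapBytes, Math.min(maxHeapBytes, target));
>     }
> }
> {code}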
>