[jira] [Updated] (FLINK-35489) Metaspace size can be too little after autotuning change memory setting

Nicolas Fraison (Jira) Thu, 30 May 2024 05:05:18 -0700


     [ 
https://issues.apache.org/jira/browse/FLINK-35489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Nicolas Fraison updated FLINK-35489:
------------------------------------
    Summary: Metaspace size can be too little after autotuning change memory 
setting  (was: Add capability to set min taskmanager.memory.managed.size when 
enabling autotuning)

> Metaspace size can be too little after autotuning change memory setting
> -----------------------------------------------------------------------
>
>                 Key: FLINK-35489
>                 URL: https://issues.apache.org/jira/browse/FLINK-35489
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>    Affects Versions: 1.8.0
>            Reporter: Nicolas Fraison
>            Priority: Major
>
> We have enabled the autotuning feature on one of our Flink jobs with the 
> config below:
> {code:java}
> # Autoscaler configuration
> job.autoscaler.enabled: "true"
> job.autoscaler.stabilization.interval: 1m
> job.autoscaler.metrics.window: 10m
> job.autoscaler.target.utilization: "0.8"
> job.autoscaler.target.utilization.boundary: "0.1"
> job.autoscaler.restart.time: 2m
> job.autoscaler.catch-up.duration: 10m
> job.autoscaler.memory.tuning.enabled: true
> job.autoscaler.memory.tuning.overhead: 0.5
> job.autoscaler.memory.tuning.maximize-managed-memory: true{code}
> During a scale down, the autotuning decided to give all the memory to the JVM 
> (with the heap scaled by a factor of 2), setting taskmanager.memory.managed.size to 0b.
> Here is the config that was computed by the autotuning for a TM running on a 
> 4GB pod:
> {code:java}
>     taskmanager.memory.network.max: 4063232b
>     taskmanager.memory.network.min: 4063232b
>     taskmanager.memory.jvm-overhead.max: 433791712b
>     taskmanager.memory.task.heap.size: 3699934605b
>     taskmanager.memory.framework.off-heap.size: 134217728b
>     taskmanager.memory.jvm-metaspace.size: 22960020b
>     taskmanager.memory.framework.heap.size: "0 bytes"
>     taskmanager.memory.flink.size: 3838215565b
>     taskmanager.memory.managed.size: 0b {code}
> This led to issues starting the TM because we rely on a javaagent that 
> allocates memory outside of the JVM (it relies on C bindings).
> Tuning the overhead or disabling scale-down-compensation.enabled could 
> have helped for that particular event, but this can lead to other issues, as it 
> could result in too small a heap size being computed.
> It would be interesting to be able to set a minimum memory.managed.size to be 
> taken into account by the autotuning.
> What do you think about this? Do you think that some other specific config 
> should have been applied to avoid this issue?
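
The computed values quoted above can be cross-checked against Flink's TaskManager memory model, in which total Flink memory is the sum of framework heap, framework off-heap, task heap, task off-heap, managed and network memory, and total process memory adds metaspace and JVM overhead on top. The following is only a sketch under two assumptions that are not stated in the report: taskmanager.memory.task.off-heap.size is taken to be 0 (it is not listed), and the effective JVM overhead is taken to equal the reported jvm-overhead.max. The dictionary keys and function names are illustrative, not Flink identifiers.

```python
# Byte values copied from the autotuned TaskManager config in the report.
computed = {
    "framework.heap": 0,
    "framework.off-heap": 134_217_728,
    "task.heap": 3_699_934_605,
    "task.off-heap": 0,             # assumption: not listed in the report
    "managed": 0,
    "network": 4_063_232,
    "jvm-metaspace": 22_960_020,
    "jvm-overhead": 433_791_712,    # assumption: overhead equals the reported max
}

# Components that make up total Flink memory (taskmanager.memory.flink.size).
FLINK_COMPONENTS = (
    "framework.heap",
    "framework.off-heap",
    "task.heap",
    "task.off-heap",
    "managed",
    "network",
)

MIB = 1024 * 1024


def total_flink_memory(cfg):
    """Sum the components that belong to total Flink memory."""
    return sum(cfg[name] for name in FLINK_COMPONENTS)


def total_process_memory(cfg):
    """Total Flink memory plus metaspace and JVM overhead."""
    return total_flink_memory(cfg) + cfg["jvm-metaspace"] + cfg["jvm-overhead"]


flink_size = total_flink_memory(computed)
process_size = total_process_memory(computed)

print(f"flink.size   : {flink_size} b")
print(f"process.size : {process_size} b ({process_size / MIB:.1f} MiB)")
print(f"metaspace    : {computed['jvm-metaspace'] / MIB:.1f} MiB")
print(f"managed      : {computed['managed']} b")
```

Under these assumptions the component sum reproduces the reported taskmanager.memory.flink.size of 3838215565b exactly, and the process total lands within one byte of 4 GiB. That leaves about 21.9 MiB of metaspace and no managed memory, so the pod has essentially no headroom for native allocations made outside the JVM.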



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
