[ https://issues.apache.org/jira/browse/FLINK-35489?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicolas Fraison updated FLINK-35489:
------------------------------------
    Summary: Metaspace size can be too little after autotuning change memory setting  (was: Add capability to set min taskmanager.memory.managed.size when enabling autotuning)

> Metaspace size can be too little after autotuning change memory setting
> -----------------------------------------------------------------------
>
>                 Key: FLINK-35489
>                 URL: https://issues.apache.org/jira/browse/FLINK-35489
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>    Affects Versions: 1.8.0
>            Reporter: Nicolas Fraison
>            Priority: Major
>
> We have enabled the autotuning feature on one of our Flink jobs with the
> config below:
> {code:java}
> # Autoscaler configuration
> job.autoscaler.enabled: "true"
> job.autoscaler.stabilization.interval: 1m
> job.autoscaler.metrics.window: 10m
> job.autoscaler.target.utilization: "0.8"
> job.autoscaler.target.utilization.boundary: "0.1"
> job.autoscaler.restart.time: 2m
> job.autoscaler.catch-up.duration: 10m
> job.autoscaler.memory.tuning.enabled: true
> job.autoscaler.memory.tuning.overhead: 0.5
> job.autoscaler.memory.tuning.maximize-managed-memory: true
> {code}
> During a scale down, the autotuning decided to give all the memory to the
> JVM (scaling the heap by 2), setting taskmanager.memory.managed.size to 0b.
> Here is the config that was computed by the autotuning for a TM running on
> a 4GB pod:
> {code:java}
> taskmanager.memory.network.max: 4063232b
> taskmanager.memory.network.min: 4063232b
> taskmanager.memory.jvm-overhead.max: 433791712b
> taskmanager.memory.task.heap.size: 3699934605b
> taskmanager.memory.framework.off-heap.size: 134217728b
> taskmanager.memory.jvm-metaspace.size: 22960020b
> taskmanager.memory.framework.heap.size: "0 bytes"
> taskmanager.memory.flink.size: 3838215565b
> taskmanager.memory.managed.size: 0b
> {code}
> This led to issues starting the TM, because we rely on a javaagent that
> allocates memory outside of the JVM (it relies on some C bindings).
> Tuning the overhead or disabling scale-down-compensation.enabled could
> have helped for that particular event, but this can lead to other issues,
> as it could result in too small a heap size being computed.
> It would be interesting to be able to set a min memory.managed.size to be
> taken into account by the autotuning.
> What do you think about this? Do you think that some other specific config
> should have been applied to avoid this issue?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
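
As a sanity check on the numbers quoted in the issue, the sketch below adds up the autotuned components to see how the 4GB pod was divided. It is only an illustration under stated assumptions: task off-heap memory is not listed in the issue and is taken as 0, and taskmanager.memory.jvm-overhead.max is treated as the effective JVM overhead. The dictionary keys are hypothetical short names, not Flink option keys.

```python
# Values copied from the autotuned config quoted in the issue (all in bytes).
MIB = 1024 ** 2
GIB = 1024 ** 3

tuned = {
    "network": 4_063_232,              # network.min == network.max
    "task_heap": 3_699_934_605,
    "framework_heap": 0,
    "framework_off_heap": 134_217_728,
    "task_off_heap": 0,                # assumption: not listed in the issue, taken as 0
    "managed": 0,
    "jvm_metaspace": 22_960_020,
    "jvm_overhead": 433_791_712,       # assumption: jvm-overhead.max is the effective value
}

# Total Flink memory is the sum of the heap, off-heap, network and managed parts.
flink_components = (
    "network", "task_heap", "framework_heap",
    "framework_off_heap", "task_off_heap", "managed",
)
flink_size = sum(tuned[name] for name in flink_components)

# Total process memory adds the JVM metaspace and JVM overhead on top.
process_size = flink_size + tuned["jvm_metaspace"] + tuned["jvm_overhead"]

print(f"flink.size   = {flink_size} b")
print(f"process.size = {process_size} b ({process_size / GIB:.3f} GiB)")
print(f"metaspace    = {tuned['jvm_metaspace'] / MIB:.1f} MiB")
print(f"managed      = {tuned['managed']} b")
```

Under these assumptions the components add up to the reported taskmanager.memory.flink.size (3838215565b) and to about 4 GiB for the whole process, which leaves roughly 21.9 MiB of metaspace and no managed memory.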