[ https://issues.apache.org/jira/browse/FLINK-36557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17890258#comment-17890258 ]
Sai Sharath Dandi commented on FLINK-36557:
-------------------------------------------

We're trying to build an autoscaler solution for YARN following the same pattern as Kubernetes, and we observed that applications sometimes do not get scaled. I think the same problem applies to the kubernetes-operator, unless I missed something in the code.

> Stale Autoscaler Context in Kubernetes Operator
> -----------------------------------------------
>
>                 Key: FLINK-36557
>                 URL: https://issues.apache.org/jira/browse/FLINK-36557
>             Project: Flink
>          Issue Type: Improvement
>          Components: Autoscaler, Kubernetes Operator
>            Reporter: Sai Sharath Dandi
>            Priority: Minor
>
> The KubernetesJobAutoScalerContext is
> [cached|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/controller/FlinkResourceContext.java#L59]
> in the FlinkResourceContext and reused. If the JobAutoScalerContext is
> initialized before the job reaches the Running state, the cached context keeps
> reporting the earlier status, which can prevent the autoscaler from triggering:
> [link|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler/src/main/java/org/apache/flink/autoscaler/JobAutoScalerImpl.java#L98].
>
> We need to either refresh the JobAutoScalerContext, similar to the standalone
> [implementation|https://github.com/apache/flink-kubernetes-operator/blob/main/flink-autoscaler-standalone/src/main/java/org/apache/flink/autoscaler/standalone/StandaloneAutoscalerExecutor.java#L127],
> or have the autoscaler module itself refresh the job status.
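
The failure mode described in the issue can be illustrated with a minimal, self-contained sketch. Nothing here is the operator's actual code: `StaleContextSketch`, `Context`, `CachingProvider`, `RefreshingProvider`, and `wouldScale` are hypothetical names, the two-value `JobStatus` enum is a simplification, and the "scale only when RUNNING" guard is an assumption about the kind of check the linked line in JobAutoScalerImpl performs.

```java
import java.util.function.Supplier;

public class StaleContextSketch {

    /** Simplified job states (assumption: the real Flink JobStatus enum has more values). */
    public enum JobStatus { CREATED, RUNNING }

    /** Hypothetical stand-in for a JobAutoScalerContext: an immutable status snapshot. */
    public static final class Context {
        private final JobStatus jobStatus;

        public Context(JobStatus jobStatus) {
            this.jobStatus = jobStatus;
        }

        public JobStatus getJobStatus() {
            return jobStatus;
        }
    }

    /** Assumed guard: scaling is skipped unless the context reports RUNNING. */
    public static boolean wouldScale(Context ctx) {
        return ctx.getJobStatus() == JobStatus.RUNNING;
    }

    /** Builds the context once and reuses it, mimicking a cached context. */
    public static final class CachingProvider {
        private final Supplier<JobStatus> statusSource;
        private Context cached;

        public CachingProvider(Supplier<JobStatus> statusSource) {
            this.statusSource = statusSource;
        }

        public Context get() {
            if (cached == null) {
                cached = new Context(statusSource.get());
            }
            return cached;
        }
    }

    /** Rebuilds the context on every call, mimicking a per-iteration refresh. */
    public static final class RefreshingProvider {
        private final Supplier<JobStatus> statusSource;

        public RefreshingProvider(Supplier<JobStatus> statusSource) {
            this.statusSource = statusSource;
        }

        public Context get() {
            return new Context(statusSource.get());
        }
    }

    public static void main(String[] args) {
        // One-element array so the lambdas can observe later status changes.
        JobStatus[] live = {JobStatus.CREATED};
        CachingProvider caching = new CachingProvider(() -> live[0]);
        RefreshingProvider refreshing = new RefreshingProvider(() -> live[0]);

        caching.get();                 // context initialized before the job is running
        live[0] = JobStatus.RUNNING;   // the job reaches RUNNING afterwards

        System.out.println("cached context scales: " + wouldScale(caching.get()));
        System.out.println("refreshed context scales: " + wouldScale(refreshing.get()));
    }
}
```

Under these assumptions the cached context never observes the transition to RUNNING, so the guard keeps rejecting it, while the refreshed context passes. The two providers correspond to the two fixes proposed in the issue only loosely: either rebuild the context for each reconciliation, or have the consumer re-read the status instead of trusting the snapshot.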