[ https://issues.apache.org/jira/browse/FLINK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17416477#comment-17416477 ]
Yangze Guo commented on FLINK-24315:
------------------------------------

[~wangyang0918] I think we can add retry logic when building the watcher and throw a fatal error if the watcher cannot be rebuilt. Could you assign this to me?

> Cannot rebuild watcher thread while the K8S API server is unavailable
> ---------------------------------------------------------------------
>
>                 Key: FLINK-24315
>                 URL: https://issues.apache.org/jira/browse/FLINK-24315
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / Kubernetes
>    Affects Versions: 1.14.0, 1.13.2
>            Reporter: ouyangwulin
>            Priority: Major
>             Fix For: 1.13.3, 1.14.1
>
>
> In native k8s integration, Flink will try to rebuild the watcher thread if
> the API server is temporarily unavailable. However, if the jitter is longer
> than the web socket timeout, the rebuilding of the watcher will time out and
> Flink cannot handle pod events correctly.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
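
For illustration only, the proposal in the comment above (retry building the watcher, then fail fatally if it still cannot be rebuilt) could be sketched roughly as follows. This is not Flink's actual code and not the fix that was merged: the class WatcherRetry, the exception FatalWatcherException, the method buildWithRetry, and its parameters are all hypothetical names, and the doubling backoff is an assumption rather than anything stated in the issue.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: retries building a watcher, then gives up with a fatal error. */
public class WatcherRetry {

    /** Hypothetical exception signalling that the watcher could not be rebuilt. */
    public static class FatalWatcherException extends RuntimeException {
        public FatalWatcherException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Calls buildWatcher up to maxAttempts times, sleeping between failed attempts.
     * The delay doubles after each failure (an assumed backoff policy).
     */
    public static <T> T buildWithRetry(
            Callable<T> buildWatcher, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delayMs = initialDelayMs;
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return buildWatcher.call();
            } catch (Exception e) {
                lastFailure = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMs);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag and stop retrying.
                        Thread.currentThread().interrupt();
                        throw new FatalWatcherException(
                                "Interrupted while waiting to rebuild the watcher", ie);
                    }
                    delayMs *= 2;
                }
            }
        }
        // All attempts failed: surface a fatal error instead of silently losing pod events.
        throw new FatalWatcherException(
                "Could not rebuild the watcher after " + maxAttempts + " attempts", lastFailure);
    }
}
```

A caller would pass whatever actually creates the watch as the Callable and treat FatalWatcherException as unrecoverable, so the process fails visibly rather than continuing without pod events.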