[ https://issues.apache.org/jira/browse/FLINK-21008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Till Rohrmann updated FLINK-21008:
----------------------------------
    Fix Version/s: 1.13.0

> ClusterEntrypoint#shutDownAsync may not be fully executed
> ---------------------------------------------------------
>
>                 Key: FLINK-21008
>                 URL: https://issues.apache.org/jira/browse/FLINK-21008
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.11.3, 1.12.1
>            Reporter: Yang Wang
>            Priority: Major
>             Fix For: 1.13.0
>
>
> Recently, in our internal use case for the native K8s integration with K8s HA
> enabled, we found that the leader-related ConfigMaps can be left behind in
> some corner cases.
> After some investigation, I think this is possibly caused by an inappropriate
> shutdown sequence.
> In {{ClusterEntrypoint#shutDownAsync}}, we first call
> {{closeClusterComponent}}, which also deregisters the Flink application from
> the cluster manager (e.g. Yarn, K8s). Then we call {{stopClusterServices}} and
> {{cleanupDirectories}}. If the cluster manager performs the deregistration
> very quickly, the JobManager process may receive signal 15 (SIGTERM) before or
> while it is executing {{stopClusterServices}} and {{cleanupDirectories}}. The
> JVM process then exits directly, so the two methods may not be executed.
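
The ordering problem described above can be illustrated with a small toy model. This is only a sketch and not Flink's actual code: the class `ShutdownOrderSketch`, the method `shutDown`, and the `killedAfterDeregister` flag are hypothetical names introduced for illustration, and the abrupt JVM exit is modelled simply by returning early. Only the three step names are taken from the issue text.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the shutdown order described in the issue (not Flink code).
final class ShutdownOrderSketch {

    /**
     * Records which shutdown steps get to run.
     *
     * @param killedAfterDeregister hypothetical flag: true models the cluster
     *        manager sending SIGTERM right after the application deregisters
     */
    static List<String> shutDown(boolean killedAfterDeregister) {
        List<String> executed = new ArrayList<>();

        // Step 1: per the issue, this step also deregisters the application.
        executed.add("closeClusterComponent");

        // Models the JVM exiting as soon as the signal arrives: nothing
        // after this point runs.
        if (killedAfterDeregister) {
            return executed;
        }

        // Steps 2 and 3: only reached if the process is still alive.
        executed.add("stopClusterServices");
        executed.add("cleanupDirectories");
        return executed;
    }

    public static void main(String[] args) {
        System.out.println("normal: " + shutDown(false));
        System.out.println("killed: " + shutDown(true));
    }
}
```

In the "killed" case only the first step is recorded, which corresponds to the situation in which the HA cleanup never runs and the ConfigMaps remain.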