Canbin Zheng created FLINK-15817:
------------------------------------

             Summary: Kubernetes Resource leak while deployment exception 
happens
                 Key: FLINK-15817
                 URL: https://issues.apache.org/jira/browse/FLINK-15817
             Project: Flink
          Issue Type: Sub-task
          Components: Deployment / Kubernetes
    Affects Versions: 1.10.0
            Reporter: Canbin Zheng
             Fix For: 1.11.0, 1.10.1


When we deploy a new session cluster on Kubernetes, the client usually creates 
the Kubernetes components in four steps, in the following order: 
internal Service -> rest Service -> ConfigMap -> JobManager Deployment.

Once the internal Service has been created, any exception that aborts the 
cluster deployment leaks Kubernetes resources, for example:
 # If creating the rest Service fails because of the service name 
constraint ([FLINK-15816|https://issues.apache.org/jira/browse/FLINK-15816]), 
the internal Service is not cleaned up when the deployment terminates.
 # If creating the JobManager Deployment fails (for example, when 
_jobmanager.heap.size_ is set to a small value such as 512M, which is less 
than the default value of 'containerized.heap-cutoff-min'), the internal 
Service, the rest Service, and the ConfigMap all leak.
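
To illustrate the second case, here is a rough sketch of the kind of heap 
cutoff check that rejects a 512M JobManager. The class, the method, and both 
default values (600 MB minimum cutoff, 0.25 ratio) are assumptions made for 
this illustration; they are not copied from the Flink code base and should be 
verified against the configuration docs of the Flink version in use.

```java
// Hypothetical sketch, not Flink source code.
class HeapCutoffSketch {

    // Assumed defaults for 'containerized.heap-cutoff-min' and
    // 'containerized.heap-cutoff-ratio'; verify them for your Flink version.
    static final int ASSUMED_CUTOFF_MIN_MB = 600;
    static final double ASSUMED_CUTOFF_RATIO = 0.25;

    /**
     * Returns the memory left for the JVM heap after subtracting the cutoff,
     * or throws if the cutoff consumes the whole container memory.
     */
    static int heapAfterCutoffMb(int containerMemoryMb, int cutoffMinMb, double cutoffRatio) {
        // The cutoff is the larger of the ratio-based value and the configured minimum.
        int cutoffMb = Math.max((int) (containerMemoryMb * cutoffRatio), cutoffMinMb);
        if (cutoffMb >= containerMemoryMb) {
            throw new IllegalArgumentException(
                "Heap cutoff of " + cutoffMb + " MB leaves no heap in a container of "
                    + containerMemoryMb + " MB");
        }
        return containerMemoryMb - cutoffMb;
    }

    public static void main(String[] args) {
        // A 1024 MB JobManager passes the check.
        System.out.println(heapAfterCutoffMb(1024, ASSUMED_CUTOFF_MIN_MB, ASSUMED_CUTOFF_RATIO));
        try {
            // A 512 MB JobManager is rejected, which aborts the deployment.
            heapAfterCutoffMb(512, ASSUMED_CUTOFF_MIN_MB, ASSUMED_CUTOFF_RATIO);
        } catch (IllegalArgumentException e) {
            System.out.println("Deployment would fail: " + e.getMessage());
        }
    }
}
```

If a check like this only runs while the JobManager Deployment is being built, 
the two Services and the ConfigMap already exist by the time it throws.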

This ticket proposes cleaning up the residual resources (the Services and the 
ConfigMap) when the cluster deployment terminates unexpectedly on the client 
side.
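
One possible shape for such a clean-up is sketched below: track each resource 
as it is created and, on failure, delete the created ones in reverse order. 
This is only an illustration of the idea, not the proposed patch. The 
Resource interface, deployAll, and the in-memory fake are hypothetical names 
invented for this sketch and do not correspond to any Flink or Kubernetes 
client API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch; none of these names are Flink or Kubernetes client APIs.
class DeploymentRollbackSketch {

    /** A cluster resource that can be created and deleted (hypothetical abstraction). */
    interface Resource {
        void create() throws Exception;
        void delete();
    }

    /**
     * Creates the resources in the given order. If any creation fails, deletes
     * the resources created so far in reverse order and rethrows the failure.
     */
    static void deployAll(List<Resource> resources) throws Exception {
        Deque<Resource> created = new ArrayDeque<>();
        try {
            for (Resource resource : resources) {
                resource.create();
                created.push(resource); // the most recently created resource is on top
            }
        } catch (Exception failure) {
            while (!created.isEmpty()) {
                try {
                    created.pop().delete();
                } catch (RuntimeException cleanupFailure) {
                    // Keep cleaning up, but do not hide the clean-up problem.
                    failure.addSuppressed(cleanupFailure);
                }
            }
            throw failure;
        }
    }

    /** In-memory stand-in for a Kubernetes resource, used only for this demo. */
    static Resource fake(String name, boolean failOnCreate, List<String> live) {
        return new Resource() {
            @Override
            public void create() throws Exception {
                if (failOnCreate) {
                    throw new Exception("failed to create " + name);
                }
                live.add(name);
            }

            @Override
            public void delete() {
                live.remove(name);
            }
        };
    }

    public static void main(String[] args) {
        List<String> live = new ArrayList<>();
        try {
            deployAll(Arrays.asList(
                fake("internal-service", false, live),
                fake("rest-service", false, live),
                fake("config-map", false, live),
                fake("jobmanager-deployment", true, live))); // simulated failure
        } catch (Exception e) {
            System.out.println("Deployment failed: " + e.getMessage());
        }
        System.out.println("Resources left behind: " + live);
    }
}
```

Deleting in reverse creation order is a design choice of this sketch; the 
ticket itself only asks that the residual Services and ConfigMap be removed.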



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
