mateczagany opened a new pull request, #777:
URL: https://github.com/apache/flink-kubernetes-operator/pull/777

   ## What is the purpose of the change
   
   Waiting for the TaskManager and JobManager pods to be removed can cause issues 
in a slow Kubernetes cluster. During an upgrade, there is a slim possibility that 
the operator will attempt to create a new `Deployment` before the previous one has 
been removed. This was mitigated by 
[FLINK-32334](https://issues.apache.org/jira/browse/FLINK-32334), which worked 
well for `standalone` mode.
   
   I wanted to simply implement `getTmPodList` for `native` mode at first, but 
after some investigation it became clear that all Kubernetes resources created 
by the operator or the JobManager will be children of a `Deployment` with 
`blockOwnerDeletion: true`, so it made sense to instead change the logic to 
poll the `Deployment` resources.
   
   I also removed the `Thread.sleep()` calls and replaced them with the Fabric8 
Informer API, which uses WebSockets to reduce the load on the Kubernetes 
cluster during cluster deletion.
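
To illustrate the difference from a `Thread.sleep()` polling loop, here is a minimal sketch of an event-driven wait with a timeout. It uses only the JDK and is not the code in this PR: the class `DeploymentRemovalWaiter` and its methods `onDelete` and `awaitRemoval` are hypothetical names, and `onDelete` stands in for whatever delete callback an informer would invoke.

```java
import java.time.Duration;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; a JDK-only stand-in for an informer-driven wait. */
final class DeploymentRemovalWaiter {

    // Completed once a delete event for the Deployment has been observed.
    private final CompletableFuture<Void> removed = new CompletableFuture<>();

    /** Assumed to be called from the event source's delete handler. */
    void onDelete() {
        removed.complete(null);
    }

    /**
     * Blocks until the delete event arrives or the timeout elapses.
     * Returns true if the removal was observed in time, false otherwise.
     */
    boolean awaitRemoval(Duration timeout) {
        try {
            removed.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // Mirrors the idea of logging when the shutdown timeout is exceeded.
            System.err.println("Timed out after " + timeout + " waiting for Deployment removal");
            return false;
        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can react to it.
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            // Not expected here, since the future is never completed exceptionally.
            return false;
        }
    }
}
```

In this sketch the waiting thread is woken by the event itself rather than by re-checking on a fixed interval. A caller would also need to handle the case where the `Deployment` is already gone before the wait starts, for example by calling `onDelete()` immediately.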
   
   ## Brief change log
   
   - Changed `AbstractFlinkService#waitUntilCondition` to use the Informer API and 
check for `Deployment` resources.
   - Added unit tests for the new method 
`AbstractFlinkService#waitForDeploymentToBeRemoved`.
   - Added a log message when the cluster shutdown timeout is exceeded.
   
   ## Verifying this change
   
   - Added new unit test
   - Manual verification of timeout and 
   
   ## Does this pull request potentially affect one of the following parts:
   
     - Dependencies (does it add or upgrade a dependency): no
     - The public API, i.e., are there any changes to the `CustomResourceDescriptors`: 
no
     - Core observer or reconciler logic that is regularly executed: no
   
   ## Documentation
   
     - Does this pull request introduce a new feature? no
     - If yes, how is the feature documented? not applicable
   
   


