vsantwana commented on code in PR #978:
URL: https://github.com/apache/flink-kubernetes-operator/pull/978#discussion_r2077190703


##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/ApplicationReconciler.java:
##########
@@ -299,9 +303,92 @@ public boolean reconcileOtherChanges(FlinkResourceContext<FlinkDeployment> ctx)
             return true;
         }
 
+        // check for JobManager exceptions if the REST API server is still up.
+        if (!ReconciliationUtils.isJobInTerminalState(deployment.getStatus())) {
+            observeJobManagerExceptions(ctx, deployment, observeConfig);
+        }
+
         return cleanupTerminalJmAfterTtl(ctx.getFlinkService(), deployment, observeConfig);
     }
 
+    private void observeJobManagerExceptions(
+            FlinkResourceContext<FlinkDeployment> ctx,
+            FlinkDeployment deployment,
+            Configuration observeConfig) {
+        try {
+            var jobId = JobID.fromHexString(deployment.getStatus().getJobStatus().getJobId());
+            var history = ctx.getFlinkService().getJobExceptions(deployment, jobId, observeConfig);
+            if (history == null || history.getExceptionHistory() == null) {
+                return;
+            }
+            var exceptionHistory = history.getExceptionHistory();
+            var exceptions = exceptionHistory.getEntries();
+            if (exceptions.isEmpty()) {
+                LOG.info(String.format("No exceptions found in job exception history for jobId '%s'.", jobId));

Review Comment:
   Makes sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.