lamber-ken edited a comment on issue #8254: [FLINK-12219][runtime] Yarn application can't stop when flink job failed in per-job yarn cluste mode
URL: https://github.com/apache/flink/pull/8254#issuecomment-486945344

@tillrohrmann, I have a new idea for fixing the bug. The root cause is that `Dispatcher#jobReachedGloballyTerminalState` can throw an exception that we don't catch, which prevents the job termination callback from running. We can catch the exception and move the callback into a `finally` block.

### Current code

`MiniDispatcher#jobReachedGloballyTerminalState`

```
protected void jobReachedGloballyTerminalState(ArchivedExecutionGraph archivedExecutionGraph) {
    super.jobReachedGloballyTerminalState(archivedExecutionGraph);

    if (executionMode == ClusterEntrypoint.ExecutionMode.DETACHED) {
        // shut down since we don't have to wait for the execution result retrieval
        jobTerminationFuture.complete(ApplicationStatus.fromJobStatus(archivedExecutionGraph.getState()));
    }
}
```

### Change to new code

```
protected void jobReachedGloballyTerminalState(ArchivedExecutionGraph archivedExecutionGraph) {
    try {
        super.jobReachedGloballyTerminalState(archivedExecutionGraph);
    } catch (Exception e) {
        log.error("jobReachedGloballyTerminalState exception", e);
    } finally {
        if (executionMode == ClusterEntrypoint.ExecutionMode.DETACHED) {
            // shut down since we don't have to wait for the execution result retrieval
            jobTerminationFuture.complete(ApplicationStatus.fromJobStatus(archivedExecutionGraph.getState()));
        }
    }
}
```
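To illustrate why the `finally` block matters, here is a minimal, self-contained sketch of the same pattern outside Flink. Everything in it is a hypothetical stand-in, not Flink code: `BaseDispatcher` plays the role of `Dispatcher`, `DetachedDispatcher` plays the role of `MiniDispatcher` in detached mode, the job state is a plain `String`, and `runScenario` is a helper added only for the demonstration. It should show that the termination future completes whether or not the parent method throws.

```java
import java.util.concurrent.CompletableFuture;

public class TerminalCallbackSketch {

    /** Hypothetical stand-in for Dispatcher: its terminal-state hook may throw. */
    static class BaseDispatcher {
        private final boolean failOnTerminalState;

        BaseDispatcher(boolean failOnTerminalState) {
            this.failOnTerminalState = failOnTerminalState;
        }

        protected void jobReachedGloballyTerminalState(String jobState) {
            if (failOnTerminalState) {
                throw new IllegalStateException("simulated failure while handling terminal state");
            }
        }
    }

    /** Hypothetical stand-in for MiniDispatcher running in detached mode. */
    static class DetachedDispatcher extends BaseDispatcher {
        final CompletableFuture<String> jobTerminationFuture = new CompletableFuture<>();

        DetachedDispatcher(boolean failOnTerminalState) {
            super(failOnTerminalState);
        }

        @Override
        protected void jobReachedGloballyTerminalState(String jobState) {
            try {
                super.jobReachedGloballyTerminalState(jobState);
            } catch (Exception e) {
                // Log and continue so the shutdown signal below is not lost.
                System.err.println("jobReachedGloballyTerminalState exception: " + e.getMessage());
            } finally {
                // Runs on both the normal and the exceptional path.
                jobTerminationFuture.complete(jobState);
            }
        }
    }

    /** Returns the value of the termination future, or null if it never completed. */
    static String runScenario(boolean parentThrows, String jobState) {
        DetachedDispatcher dispatcher = new DetachedDispatcher(parentThrows);
        dispatcher.jobReachedGloballyTerminalState(jobState);
        return dispatcher.jobTerminationFuture.getNow(null);
    }

    public static void main(String[] args) {
        System.out.println("parent ok:     " + runScenario(false, "FINISHED"));
        System.out.println("parent throws: " + runScenario(true, "FAILED"));
    }
}
```

One trade-off of this design is that the exception is only logged and not rethrown, so callers of the method can no longer observe the failure. Whether that is acceptable for the real `MiniDispatcher` is a question for the review.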