Wenlong Lyu created FLINK-16279:
-----------------------------------

             Summary: Per job Yarn application leak in normal execution mode.
                 Key: FLINK-16279
                 URL: https://issues.apache.org/jira/browse/FLINK-16279
             Project: Flink
          Issue Type: Bug
          Components: Runtime / Coordination
    Affects Versions: 1.10.0
            Reporter: Wenlong Lyu

I ran a job in YARN per-job mode using {{env.executeAsync}}. The job failed, but the YARN cluster was not destroyed.

After looking into the code, I found that when running in attached mode, MiniDispatcher never completes {{shutDownFuture}} until it has received a job result request from the job client:

{code}
if (executionMode == ClusterEntrypoint.ExecutionMode.NORMAL) {
    // terminate the MiniDispatcher once we served the first JobResult successfully
    jobResultFuture.thenAccept((JobResult result) -> {
        ApplicationStatus status = result.getSerializedThrowable().isPresent() ?
            ApplicationStatus.FAILED : ApplicationStatus.SUCCEEDED;

        LOG.debug("Shutting down per-job cluster because someone retrieved the job result.");
        shutDownFuture.complete(status);
    });
}
{code}

However, when running in async mode (the job is submitted via {{env.executeAsync}}), there may be no such request from the job client: once a user sees from the job client that the job has failed, they may never request the result again. In that case {{shutDownFuture}} is never completed and the per-job cluster keeps running.
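
The behaviour described above can be illustrated in isolation with a small model. This is only a sketch, not Flink code: {{ShutdownLeakSketch}}, {{ToyDispatcher}}, {{Status}}, {{jobFinished}} and {{requestJobResult}} are hypothetical names introduced for illustration, and the model assumes that, as in the quoted branch, the shutdown future is completed only from the result-request path.

```java
import java.util.concurrent.CompletableFuture;

public class ShutdownLeakSketch {

    /** Stand-in for ApplicationStatus; hypothetical, not the Flink enum. */
    public enum Status { SUCCEEDED, FAILED }

    /** Toy model of the NORMAL-mode branch quoted above; not Flink code. */
    public static final class ToyDispatcher {
        // Completed when the job reaches a terminal state.
        private final CompletableFuture<Status> jobResultFuture = new CompletableFuture<>();
        // In this model the cluster is torn down only once this future completes.
        public final CompletableFuture<Status> shutDownFuture = new CompletableFuture<>();

        /** The job terminates; nothing on this path touches shutDownFuture. */
        public void jobFinished(Status status) {
            jobResultFuture.complete(status);
        }

        /** A client asks for the result; only this path triggers the shutdown. */
        public CompletableFuture<Status> requestJobResult() {
            jobResultFuture.thenAccept(shutDownFuture::complete);
            return jobResultFuture;
        }
    }

    public static void main(String[] args) {
        ToyDispatcher dispatcher = new ToyDispatcher();

        // The job fails, but no client has asked for the result yet.
        dispatcher.jobFinished(Status.FAILED);
        System.out.println("shutdown triggered after failure: "
                + dispatcher.shutDownFuture.isDone());

        // Only an explicit result request completes the shutdown future.
        dispatcher.requestJobResult();
        System.out.println("shutdown triggered after result request: "
                + dispatcher.shutDownFuture.isDone());
    }
}
```

In this model the first line prints {{false}} and the second prints {{true}}. A client that stops after observing the failure therefore leaves the shutdown future incomplete, which matches the leak reported here.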