[ https://issues.apache.org/jira/browse/FLINK-9030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrey Zagrebin closed FLINK-9030.
----------------------------------
    Resolution: Abandoned

This does not look problematic in the latest Flink 1.11 release. I am closing the issue.
Please, reopen if there are still problems and more information, like logs etc.

> JobManager fails to archive job to FS when TM is lost
> -----------------------------------------------------
>
>                 Key: FLINK-9030
>                 URL: https://issues.apache.org/jira/browse/FLINK-9030
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / Mesos, Runtime / Coordination
>    Affects Versions: 1.4.0
>            Reporter: Jared Stehler
>            Priority: Major
>
> We are running flink on mesos, and are finding that when a job fails due to a
> task manager getting lost (from an OOM kill), the job isn't archived properly
> into the history server dir on the filesystem.
> When this happens, the job does appear in the finished listing in the job
> manager's in-memory archive view, and is accessible in the running job
> manager's rest api, but obviously not in the history server's rest api.
> This is causing us issues as we are using the history server as a system of
> record for canceled or failed jobs in order to determine previous savepoint /
> external checkpoints.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)