Matthias Pohl created FLINK-29223:
-------------------------------------

             Summary: Missing debug output for when filtering JobGraphs based 
on their persisted JobResult
                 Key: FLINK-29223
                 URL: https://issues.apache.org/jira/browse/FLINK-29223
             Project: Flink
          Issue Type: Improvement
          Components: Runtime / Coordination
            Reporter: Matthias Pohl


We have a case where the logs do not show a job being registered in the 
{{JobResultStore}} after it reached a globally-terminal state (HA mode 
enabled).

We would have expected the job to be picked up again for recovery after the JM 
failover, but that did not happen either. We're missing a debug statement here 
that would help us identify the case where the job was actually registered in the 
{{JobResultStore}} but the [log message 
afterwards|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L1145]
 isn't printed.

We could fix that by adding some info logs to the filtering mechanism used when 
recovering the jobs, as an {{else}} branch in 
[SessionDispatcherLeaderProcess:149|https://github.com/apache/flink/blob/63817b5ffdf7ba24a168aeec95464d13e4d78e13/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/runner/SessionDispatcherLeaderProcess.java#L149]
 (and in 
[JobDispatcherLeaderProcessFactoryFactory|/home/mapohl/workspace/flink-master/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/runner/JobDispatcherLeaderProcessFactoryFactory.java]
 accordingly)
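
A minimal sketch of what such an {{else}} branch could look like, not the actual Flink code. The class name, the method name, and the use of plain strings for job IDs are all hypothetical, and java.util.logging stands in for whatever logger the dispatcher classes use. The set of IDs passed in represents the jobs that already have an entry in the {{JobResultStore}}.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical stand-in for the recovery filter; not a Flink class.
public class RecoveryFilterSketch {

    private static final Logger LOG =
            Logger.getLogger(RecoveryFilterSketch.class.getName());

    /**
     * Returns the IDs of the persisted jobs that should be recovered.
     * Jobs that already have a persisted result are skipped, and the
     * skip is logged so that it shows up after a failover.
     */
    public static List<String> filterRecoverableJobs(
            List<String> persistedJobIds, Set<String> jobIdsWithResult) {
        List<String> toRecover = new ArrayList<>();
        for (String jobId : persistedJobIds) {
            if (!jobIdsWithResult.contains(jobId)) {
                toRecover.add(jobId);
            } else {
                // The branch this issue asks for: make the filtering visible.
                LOG.info(String.format(
                        "Skipping recovery of job %s: a result is already persisted.",
                        jobId));
            }
        }
        return toRecover;
    }

    public static void main(String[] args) {
        List<String> recovered = filterRecoverableJobs(
                List.of("job-a", "job-b"), Set.of("job-b"));
        System.out.println(recovered);
    }
}
```

With a log line like this, a job that is silently dropped during recovery can be told apart from a job that was never found in the first place.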



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
