[ https://issues.apache.org/jira/browse/FLINK-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16958638#comment-16958638 ]

Zili Chen commented on FLINK-14434:
-----------------------------------

FYI, I just noticed that {{thenApply}} (and the other methods without an {{Async}} suffix) may run in the calling thread, instead of the thread that executed the previous computation, if the previous computation has already completed when {{thenApply}} is called. The analysis above is therefore slightly wrong: we don't ensure that the start is scheduled on akka-dispatcher; it may instead run synchronously with {{thenApply}}, which means in the MainThread.

However, the fix isn't affected by this discovery. This is just for further information in case the difference becomes significant.

> Dispatcher#createJobManagerRunner should returns on creation succeed, not
> after startJobManagerRunner
> -----------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-14434
>                 URL: https://issues.apache.org/jira/browse/FLINK-14434
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.8.2, 1.10.0, 1.9.1
>            Reporter: Zili Chen
>            Assignee: Zili Chen
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.10.0, 1.8.3, 1.9.2
>
>         Attachments: patch.diff
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In an edge case, let's say
> 1) the job finishes nearly immediately
> 2) the Dispatcher is suspended in {{#startJobManagerRunner}} after {{jobManagerRunner.start();}} but before {{return jobManagerRunner;}}
> This can happen because
> 1) the future we put into {{jobManagerRunnerFutures}} completes only when {{#startJobManagerRunner}} has finished.
> 2) the creation of the JobManagerRunner doesn't happen in the MainThread.
> The following execution order is possible:
> 1) the JobManagerRunner is created in an akka-dispatcher thread
> 2) then {{Dispatcher#startJobManagerRunner}} is applied
> 3) it runs until after {{jobManagerRunner.start();}} but before {{return jobManagerRunner;}}
> 4) this thread is suspended
> 5) the job finishes and its callback executes on the MainThread
> 6) {{jobManagerRunnerFutures.get(jobID).getNow(null)}} returns {{null}} because the akka-dispatcher thread hasn't yet executed {{return jobManagerRunner;}}
> 7) it reports {{There is a newer JobManagerRunner for the job}}, although there actually isn't one.
>
> **Solution**
> There are two approaches, and we can even apply both.
> 1. return {{jobManagerRunnerFuture}} in {{#createJobManagerRunner}} and make {{#startJobManagerRunner}} an action
> 2. once the JobManagerRunner is created, execute {{#startJobManagerRunner}} in the MainThread.
>
> CC [~trohrmann]
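
For anyone who wants to reproduce the two behaviours discussed here outside of Flink, the following is a minimal sketch using only {{java.util.concurrent}}. It is not Dispatcher code: the class name, the {{"runner"}} placeholder value, and the thread name {{akka-dispatcher-stand-in}} are made up for illustration, and the latches only simulate the thread being suspended between {{jobManagerRunner.start();}} and {{return jobManagerRunner;}}. The second method relies on how {{CompletableFuture}} behaves in practice when the dependent is registered before completion; the Javadoc only says the action may run in the completing thread.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThenApplyThreadDemo {
    // Hypothetical stand-in for an akka-dispatcher thread.
    static final String POOL_THREAD = "akka-dispatcher-stand-in";

    private static ExecutorService newPool() {
        return Executors.newSingleThreadExecutor(r -> new Thread(r, POOL_THREAD));
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** The future is already completed, so thenApply runs its function in the calling thread. */
    public static String threadWhenAlreadyCompleted() {
        CompletableFuture<String> created = CompletableFuture.completedFuture("runner");
        return created.thenApply(r -> Thread.currentThread().getName()).join();
    }

    /** thenApply is registered before completion, so the completing thread runs the function. */
    public static String threadWhenCompletedLater() {
        ExecutorService pool = newPool();
        try {
            CompletableFuture<String> created = new CompletableFuture<>();
            CompletableFuture<String> started =
                created.thenApply(r -> Thread.currentThread().getName());
            pool.execute(() -> created.complete("runner"));
            return started.join();
        } finally {
            pool.shutdown();
        }
    }

    /** While the dependent stage is still running, getNow(null) on its future yields null. */
    public static boolean getNowIsNullWhileStartInFlight() {
        ExecutorService pool = newPool();
        CountDownLatch startCalled = new CountDownLatch(1);
        CountDownLatch resume = new CountDownLatch(1);
        try {
            CompletableFuture<String> created = new CompletableFuture<>();
            CompletableFuture<String> started = created.thenApply(r -> {
                startCalled.countDown(); // simulates "jobManagerRunner.start();" having run
                await(resume);           // simulates the thread being suspended before "return"
                return r;
            });
            pool.execute(() -> created.complete("runner"));
            await(startCalled);
            String observed = started.getNow(null); // what the callback on the other thread sees
            resume.countDown();
            started.join();
            return observed == null;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("already completed -> " + threadWhenAlreadyCompleted());
        System.out.println("completed later   -> " + threadWhenCompletedLater());
        System.out.println("getNow(null) null -> " + getNowIsNullWhileStartInFlight());
    }
}
```

Note that the third method registers {{thenApply}} before the source future is completed on purpose. If the source were already complete at that point, the blocking function would run in the calling thread itself, which is exactly the situation described in the comment above.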