[ 
https://issues.apache.org/jira/browse/FLINK-15434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17006260#comment-17006260
 ] 

hailong wang commented on FLINK-15434:
--------------------------------------

I reproduced this problem on my laptop, and the debug log as follow:
{code:java}
 [flink-akka.actor.default-dispatcher-4] INFO  akka.event.slf4j.Slf4jLogger  - 
Slf4jLogger started0    [flink-akka.actor.default-dispatcher-4] INFO  
akka.event.slf4j.Slf4jLogger  - Slf4jLogger started72   
[flink-akka.actor.default-dispatcher-4] DEBUG akka.event.EventStream  - logger 
log1-Slf4jLogger started79   [flink-akka.actor.default-dispatcher-4] DEBUG 
akka.event.EventStream  - Default Loggers started1165 [main] INFO  
org.apache.flink.runtime.jobmaster.JobMasterTest  - 
================================================================================Test
 
testResourceManagerConnectionAfterRegainingLeadership(org.apache.flink.runtime.jobmaster.JobMasterTest)
 is 
running.--------------------------------------------------------------------------------4106
 [main] INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService  - Starting RPC 
endpoint for org.apache.flink.runtime.jobmaster.JobMaster at 
akka://flink/user/jobmanager_0 .4171 [main] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Initializing job (unnamed job) 
(f781d92dbe44f52505eff1b144e45bb6).4268 [main] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Using restart strategy 
NoRestartStrategy for (unnamed job) (f781d92dbe44f52505eff1b144e45bb6).4362 
[main] INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph  - Job 
recovers via failover strategy: full graph restart4433 [main] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Running initialization on 
master for job (unnamed job) (f781d92dbe44f52505eff1b144e45bb6).4433 [main] 
INFO  org.apache.flink.runtime.jobmaster.JobMaster  - Successfully ran 
initialization on master in 0 ms.4439 [main] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Adding 0 vertices from job 
graph (unnamed job) (f781d92dbe44f52505eff1b144e45bb6).4439 [main] DEBUG 
org.apache.flink.runtime.executiongraph.ExecutionGraph  - Attaching 0 
topologically sorted vertices to existing job graph with 0 vertices and 0 
intermediate results.4484 [main] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Successfully created execution 
graph from job graph (unnamed job) (f781d92dbe44f52505eff1b144e45bb6).4518 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Starting execution of job 
(unnamed job) (f781d92dbe44f52505eff1b144e45bb6) under job master id 
5b6d296cfdedc261c4bbd525aea971b3.4521 [flink-akka.actor.default-dispatcher-4] 
INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph  - Job (unnamed 
job) (f781d92dbe44f52505eff1b144e45bb6) switched from state CREATED to 
RUNNING.4533 [flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.4537 
[flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.4543 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Connecting to ResourceManager 
localhost/cc67ffdf-cb14-4f8a-bf85-ddd6f11327b2(5421d4878888c36920c6ee4eb0136801)4556
 [flink-akka.actor.default-dispatcher-2] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Resolved ResourceManager 
address, beginning registration4557 [flink-akka.actor.default-dispatcher-2] 
INFO  org.apache.flink.runtime.jobmaster.JobMaster  - Registration at 
ResourceManager attempt 1 (timeout=100ms)4565 
[flink-akka.actor.default-dispatcher-3] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.4594 
[flink-akka.actor.default-dispatcher-3] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.into 
5b6d296cfdedc261c4bbd525aea971b3, f781d92dbe44f52505eff1b144e45bb64619 
[flink-akka.actor.default-dispatcher-3] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - JobManager successfully 
registered at ResourceManager, leader id: 5421d4878888c36920c6ee4eb0136801.4629 
[flink-akka.actor.default-dispatcher-3] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.4656 
[flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.4657 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - The heartbeat of 
ResourceManager with id 4cce21c1eac0cad2675e364a1d0f41c1 timed out.4668 
[flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Close ResourceManager 
connection 
4cce21c1eac0cad2675e364a1d0f41c1.org.apache.flink.runtime.jobmaster.JobMasterException:
 The heartbeat of ResourceManager with id 4cce21c1eac0cad2675e364a1d0f41c1 
timed out. at 
org.apache.flink.runtime.jobmaster.JobMaster$ResourceManagerHeartbeatListener.notifyHeartbeatTimeout(JobMaster.java:1179)
 at 
org.apache.flink.runtime.heartbeat.HeartbeatManagerImpl$HeartbeatMonitor.run(HeartbeatManagerImpl.java:318)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
 at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
 at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
 at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
 at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) at 
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) at 
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) at 
akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at 
akka.actor.Actor$class.aroundReceive(Actor.scala:517) at 
akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) at 
akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) at 
akka.actor.ActorCell.invoke(ActorCell.scala:561) at 
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) at 
akka.dispatch.Mailbox.run(Mailbox.scala:225) at 
akka.dispatch.Mailbox.exec(Mailbox.scala:235) at 
akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)4679
 [flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Connecting to ResourceManager 
localhost/cc67ffdf-cb14-4f8a-bf85-ddd6f11327b2(5421d4878888c36920c6ee4eb0136801)4682
 [flink-akka.actor.default-dispatcher-6] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Resolved ResourceManager 
address, beginning registration4683 [flink-akka.actor.default-dispatcher-6] 
INFO  org.apache.flink.runtime.jobmaster.JobMaster  - Registration at 
ResourceManager attempt 1 (timeout=100ms)into 5b6d296cfdedc261c4bbd525aea971b3, 
f781d92dbe44f52505eff1b144e45bb64682 [flink-akka.actor.default-dispatcher-4] 
DEBUG org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat 
request.4685 [flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - JobManager successfully 
registered at ResourceManager, leader id: 5421d4878888c36920c6ee4eb0136801.4686 
[flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.4713 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph  - Job (unnamed job) 
(f781d92dbe44f52505eff1b144e45bb6) switched from state RUNNING to 
SUSPENDED.org.apache.flink.util.FlinkException: Test exception. at 
org.apache.flink.runtime.jobmaster.JobMasterTest.testResourceManagerConnectionAfterRegainingLeadership(JobMasterTest.java:1025)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
org.junit.runner.JUnitCore.run(JUnitCore.java:137) at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)4730 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph  - Job 
f781d92dbe44f52505eff1b144e45bb6 has been suspended.4731 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl  - Suspending 
SlotPool.4731 [flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Close ResourceManager 
connection 
4cce21c1eac0cad2675e364a1d0f41c1.org.apache.flink.util.FlinkException: Test 
exception. at 
org.apache.flink.runtime.jobmaster.JobMasterTest.testResourceManagerConnectionAfterRegainingLeadership(JobMasterTest.java:1025)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
org.junit.runner.JUnitCore.run(JUnitCore.java:137) at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)4733 
[flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor  - Fencing token not set: 
Ignoring message LocalFencedMessage(5b6d296cfdedc261c4bbd525aea971b3, 
org.apache.flink.runtime.rpc.messages.RunAsync@4826bb87) because the fencing 
token is null.4739 [flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor  - Fencing token not set: 
Ignoring message LocalFencedMessage(5b6d296cfdedc261c4bbd525aea971b3, 
org.apache.flink.runtime.rpc.messages.RunAsync@5ebd3007) because the fencing 
token is null.4740 [flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor  - Fencing token not set: 
Ignoring message LocalFencedMessage(null, 
org.apache.flink.runtime.rpc.messages.RunAsync@53eb8b7a) because the fencing 
token is null.4744 [flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Starting execution of job 
(unnamed job) (f781d92dbe44f52505eff1b144e45bb6) under job master id 
8db9e94add813aacf2794eb6881a1573.4747 [flink-akka.actor.default-dispatcher-4] 
INFO  org.apache.flink.runtime.jobmaster.JobMaster  - Using restart strategy 
NoRestartStrategy for (unnamed job) (f781d92dbe44f52505eff1b144e45bb6).4749 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph  - Job recovers via 
failover strategy: full graph restart4749 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Running initialization on 
master for job (unnamed job) (f781d92dbe44f52505eff1b144e45bb6).4749 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Successfully ran initialization 
on master in 0 ms.4749 [flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Adding 0 vertices from job 
graph (unnamed job) (f781d92dbe44f52505eff1b144e45bb6).4750 
[flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.executiongraph.ExecutionGraph  - Attaching 0 
topologically sorted vertices to existing job graph with 0 vertices and 0 
intermediate results.4750 [flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Successfully created execution 
graph from job graph (unnamed job) (f781d92dbe44f52505eff1b144e45bb6).4755 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph  - Job (unnamed job) 
(f781d92dbe44f52505eff1b144e45bb6) switched from state CREATED to RUNNING.4756 
[flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat 
request.5b6d296cfdedc261c4bbd525aea971b34758 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Connecting to ResourceManager 
localhost/cc67ffdf-cb14-4f8a-bf85-ddd6f11327b2(5421d4878888c36920c6ee4eb0136801)4758
 [flink-akka.actor.default-dispatcher-4] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.4758 
[flink-akka.actor.default-dispatcher-2] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Resolved ResourceManager 
address, beginning registration4758 [flink-akka.actor.default-dispatcher-2] 
INFO  org.apache.flink.runtime.jobmaster.JobMaster  - Registration at 
ResourceManager attempt 1 (timeout=100ms)5b6d296cfdedc261c4bbd525aea971b3into 
8db9e94add813aacf2794eb6881a1573, f781d92dbe44f52505eff1b144e45bb64760 
[flink-akka.actor.default-dispatcher-4] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - JobManager successfully 
registered at ResourceManager, leader id: 5421d4878888c36920c6ee4eb0136801.4787 
[flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - The heartbeat of 
ResourceManager with id 4cce21c1eac0cad2675e364a1d0f41c1 timed out.4787 
[flink-akka.actor.default-dispatcher-5] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Close ResourceManager 
connection 
4cce21c1eac0cad2675e364a1d0f41c1.org.apache.flink.runtime.jobmaster.JobMasterException:
 The heartbeat of ResourceManager with id 4cce21c1eac0cad2675e364a1d0f41c1 
timed out. at 
org.apache.flink.runtime.jobmaster.JobMaster$ResourceManagerHeartbeatListener.notifyHeartbeatTimeout(JobMaster.java:1179)
 at 
org.apache.flink.runtime.heartbeat.HeartbeatManagerImpl$HeartbeatMonitor.run(HeartbeatManagerImpl.java:318)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at 
java.util.concurrent.FutureTask.run(FutureTask.java:266) at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
 at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
 at 
org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
 at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
 at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) at 
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) at 
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) at 
akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at 
akka.actor.Actor$class.aroundReceive(Actor.scala:517) at 
akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) at 
akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) at 
akka.actor.ActorCell.invoke(ActorCell.scala:561) at 
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) at 
akka.dispatch.Mailbox.run(Mailbox.scala:225) at 
akka.dispatch.Mailbox.exec(Mailbox.scala:235) at 
akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)4788
 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Connecting to ResourceManager 
localhost/cc67ffdf-cb14-4f8a-bf85-ddd6f11327b2(5421d4878888c36920c6ee4eb0136801)4790
 [flink-akka.actor.default-dispatcher-5] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Trigger heartbeat request.4790 
[flink-akka.actor.default-dispatcher-7] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Resolved ResourceManager 
address, beginning registration4793 [flink-akka.actor.default-dispatcher-7] 
INFO  org.apache.flink.runtime.jobmaster.JobMaster  - Registration at 
ResourceManager attempt 1 (timeout=100ms)into 8db9e94add813aacf2794eb6881a1573, 
f781d92dbe44f52505eff1b144e45bb64796 [flink-akka.actor.default-dispatcher-5] 
INFO  org.apache.flink.runtime.jobmaster.JobMaster  - JobManager successfully 
registered at ResourceManager, leader id: 5421d4878888c36920c6ee4eb0136801.4796 
[flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.jobmaster.JobMaster  - Stopping the JobMaster for job 
(unnamed job)(f781d92dbe44f52505eff1b144e45bb6).4798 
[flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph  - Job (unnamed job) 
(f781d92dbe44f52505eff1b144e45bb6) switched from state RUNNING to 
SUSPENDED.org.apache.flink.util.FlinkException: JobManager is shutting down. at 
org.apache.flink.runtime.jobmaster.JobMaster.onStop(JobMaster.java:347) at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StartedState.terminate(AkkaRpcActor.java:509)
 at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:175)
 at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) at 
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) at 
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) at 
akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at 
akka.actor.Actor$class.aroundReceive(Actor.scala:517) at 
akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) at 
akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) at 
akka.actor.ActorCell.invoke(ActorCell.scala:561) at 
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) at 
akka.dispatch.Mailbox.run(Mailbox.scala:225) at 
akka.dispatch.Mailbox.exec(Mailbox.scala:235) at 
akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)4803
 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.executiongraph.ExecutionGraph  - Job 
f781d92dbe44f52505eff1b144e45bb6 has been suspended.4803 
[flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl  - Suspending 
SlotPool.4804 [flink-akka.actor.default-dispatcher-5] DEBUG 
org.apache.flink.runtime.jobmaster.JobMaster  - Close ResourceManager 
connection 
4cce21c1eac0cad2675e364a1d0f41c1.org.apache.flink.util.FlinkException: 
JobManager is shutting down. at 
org.apache.flink.runtime.jobmaster.JobMaster.onStop(JobMaster.java:347) at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor$StartedState.terminate(AkkaRpcActor.java:509)
 at 
org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleControlMessage(AkkaRpcActor.java:175)
 at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26) at 
akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21) at 
scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) at 
akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170) at 
scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171) at 
akka.actor.Actor$class.aroundReceive(Actor.scala:517) at 
akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225) at 
akka.actor.ActorCell.receiveMessage(ActorCell.scala:592) at 
akka.actor.ActorCell.invoke(ActorCell.scala:561) at 
akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258) at 
akka.dispatch.Mailbox.run(Mailbox.scala:225) at 
akka.dispatch.Mailbox.exec(Mailbox.scala:235) at 
akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at 
akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) 
at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at 
akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)4805
 [flink-akka.actor.default-dispatcher-5] INFO  
org.apache.flink.runtime.jobmaster.slotpool.SlotPoolImpl  - Stopping 
SlotPool.4916 [main] ERROR org.apache.flink.runtime.jobmaster.JobMasterTest  - 
--------------------------------------------------------------------------------Test
 
testResourceManagerConnectionAfterRegainingLeadership(org.apache.flink.runtime.jobmaster.JobMasterTest)
 failed with:java.lang.AssertionError: Expected: 
<8db9e94add813aacf2794eb6881a1573>     but: was 
<5b6d296cfdedc261c4bbd525aea971b3> at 
org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at 
org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) at 
org.apache.flink.runtime.jobmaster.JobMasterTest.testResourceManagerConnectionAfterRegainingLeadership(JobMasterTest.java:1034)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
org.junit.runner.JUnitCore.run(JUnitCore.java:137) at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)
================================================================================4942
 [main] INFO  org.apache.flink.runtime.rpc.akka.AkkaRpcService  - Stopping Akka 
RPC service.
java.lang.AssertionError: Expected: <8db9e94add813aacf2794eb6881a1573>     but: 
was <5b6d296cfdedc261c4bbd525aea971b3>Expected 
:<8db9e94add813aacf2794eb6881a1573>Actual   
:<5b6d296cfdedc261c4bbd525aea971b3><Click to see difference> at 
org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at 
org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8) at 
org.apache.flink.runtime.jobmaster.JobMasterTest.testResourceManagerConnectionAfterRegainingLeadership(JobMasterTest.java:1034)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
 at 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
 at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) at 
org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) at 
org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) at 
org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:48) at 
org.junit.rules.RunRules.evaluate(RunRules.java:20) at 
org.junit.runners.ParentRunner.run(ParentRunner.java:363) at 
org.junit.runner.JUnitCore.run(JUnitCore.java:137) at 
com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:68)
 at 
com.intellij.rt.junit.IdeaTestRunner$Repeater.startRunnerWithArgs(IdeaTestRunner.java:33)
 at 
com.intellij.rt.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:230)
 at com.intellij.rt.junit.JUnitStarter.main(JUnitStarter.java:58)Process 
finished with exit code 255{code}
According to the log, for the lost heartbeat of ResourceManager, It will 
reconnect. So the registrationQueue's element will be old JobMasterId.

So I think we can look if registrationQueue is empty, if so, we delete  it and 
offer new value then.
{code:java}
testingResourceManagerGateway.setRegisterJobManagerConsumer(
   jobMasterIdResourceIDStringJobIDTuple4 -> {
      if (!registrationQueue.isEmpty()) {
         registrationQueue.poll();
      }
      registrationQueue.offer(jobMasterIdResourceIDStringJobIDTuple4.f0);
   });
{code}
 

> testResourceManagerConnectionAfterRegainingLeadership test fail when run  
> azure
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-15434
>                 URL: https://issues.apache.org/jira/browse/FLINK-15434
>             Project: Flink
>          Issue Type: Bug
>          Components: Tests
>    Affects Versions: 1.9.1
>            Reporter: hailong wang
>            Priority: Major
>             Fix For: 1.10.0
>
>
> Error message
> Expected: <b65797325db6323dd5f8bdfeeaa18467>
>  but: was <57f975298d116a9d7623f5a844ce6502>
>  
> Stack trace
> java.lang.AssertionError: Expected: <b65797325db6323dd5f8bdfeeaa18467> but: 
> was <57f975298d116a9d7623f5a844ce6502> at 
> org.apache.flink.runtime.jobmaster.JobMasterTest.testResourceManagerConnectionAfterRegainingLeadership(JobMasterTest.java:1033)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to