Hi ynagireddy4u,

We have seen this exception before. It is usually caused by one of the following:
1) The TaskManager is too busy with other work to send heartbeats to the JobMaster, or the TaskManager process has already exited.
2) There is a network issue between this TaskManager and the JobMaster.
3) In some cases, the JobMaster actor is itself too busy to process the RPC requests from the TaskManager.

Please check whether your problem fits any of the above situations.

Best,
Xiangyu

Y SREEKARA BHARGAVA REDDY <ynagiredd...@gmail.com> wrote on Mon, Jul 31, 2023 at 20:49:

> Hi Team,
>
> Did anyone face the below exception?
> If yes, please share the resolution.
>
> 2023-07-28 22:04:16
> java.util.concurrent.TimeoutException: Heartbeat of TaskManager with id container_e19_1690528962823_0382_01_000005 timed out.
>     at org.apache.flink.runtime.jobmaster.JobMaster$TaskManagerHeartbeatListener.notifyHeartbeatTimeout(JobMaster.java:1147)
>     at org.apache.flink.runtime.heartbeat.HeartbeatMonitorImpl.run(HeartbeatMonitorImpl.java:109)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRunAsync(AkkaRpcActor.java:397)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:190)
>     at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:74)
>     at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:152)
>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
>     at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
>     at scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123)
>     at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:170)
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>     at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
>     at akka.actor.Actor$class.aroundReceive(Actor.scala:517)
>     at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
>     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
>     at akka.actor.ActorCell.invoke(ActorCell.scala:561)
>     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
>     at akka.dispatch.Mailbox.run(Mailbox.scala:225)
>     at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
>     at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>     at akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>     at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>     at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> Any suggestions, please share with me.
>
> Regards,
> Nagireddy Y