Hey Ravinder, can you please share the JobManager logs as well?
The logs say that the TaskManager disconnects from the JobManager, because that one is not reachable anymore. At this point, the running shuffles are cancelled and you see the follow up RemoteTransportExceptions. – Ufuk On Mon, Mar 21, 2016 at 5:42 PM, Ravinder Kaur <neetu0...@gmail.com> wrote: > Hello All, > > I'm running the WordCount example streaming job and it fails because of loss > of Taskmanagers. > > When gone through the logs of the taskmanager it has the following messages > > 15:14:26,592 INFO org.apache.flink.streaming.runtime.tasks.StreamTask > - State backend is set to heap memory (checkpoint to jobmanager) > 15:14:26,703 INFO org.apache.flink.runtime.taskmanager.Task > - Keyed Aggregation -> Sink: Unnamed (15/25) switched to RUNNING > 15:14:26,709 INFO org.apache.flink.runtime.taskmanager.Task > - Keyed Aggregation -> Sink: Unnamed (17/25) switched to RUNNING > 15:14:26,712 INFO org.apache.flink.runtime.taskmanager.Task > - Keyed Aggregation -> Sink: Unnamed (16/25) switched to RUNNING > 15:14:26,714 INFO org.apache.flink.runtime.taskmanager.Task > - Keyed Aggregation -> Sink: Unnamed (18/25) switched to RUNNING > 15:27:29,356 INFO org.apache.flink.runtime.taskmanager.Task > - Keyed Aggregation -> Sink: Unnamed (15/25) switched to FAILED with > exception. > org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: > Connection unexpectedly closed by remote task manager > 'vm-10-155-208-135.cloud.mwn.de/10.155.208.135:37028'. This might indicate > that the remote task manager was lost. > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:119) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:306) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:828) > at > io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:621) > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:358) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) > at java.lang.Thread.run(Thread.java:745) > 15:27:29,401 INFO org.apache.flink.runtime.taskmanager.TaskManager > - TaskManager akka://flink/user/taskmanager disconnects from JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager: JobManager is no > longer reachable > 15:27:29,384 WARN akka.remote.RemoteWatcher > - Detected unreachable: [akka.tcp://flink@10.155.208.156:6123] > 15:27:29,356 INFO org.apache.flink.runtime.taskmanager.Task > - Keyed Aggregation -> Sink: Unnamed (18/25) switched to FAILED with > exception. > org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: > Connection unexpectedly closed by remote task manager > 'vm-10-155-208-135.cloud.mwn.de/10.155.208.135:37028'. This might indicate > that the remote task manager was lost. > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:119) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:306) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:828) > at > io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:621) > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:358) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) > at java.lang.Thread.run(Thread.java:745) > 15:27:29,356 INFO org.apache.flink.runtime.taskmanager.Task > - Keyed Aggregation -> Sink: Unnamed (16/25) switched to FAILED with > exception. > org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: > Connection unexpectedly closed by remote task manager > 'vm-10-155-208-135.cloud.mwn.de/10.155.208.135:37028'. This might indicate > that the remote task manager was lost. > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:119) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:306) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:828) > at > io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:621) > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:358) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) > at java.lang.Thread.run(Thread.java:745) > 15:27:29,356 INFO org.apache.flink.runtime.taskmanager.Task > - Keyed Aggregation -> Sink: Unnamed (17/25) switched to FAILED with > exception. > org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: > Connection unexpectedly closed by remote task manager > 'vm-10-155-208-135.cloud.mwn.de/10.155.208.135:37028'. This might indicate > that the remote task manager was lost. > at > org.apache.flink.runtime.io.network.netty.PartitionRequestClientHandler.channelInactive(PartitionRequestClientHandler.java:119) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.channel.ChannelInboundHandlerAdapter.channelInactive(ChannelInboundHandlerAdapter.java:75) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:306) > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:208) > at > io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:194) > at > io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:828) > at > io.netty.channel.AbstractChannel$AbstractUnsafe$7.run(AbstractChannel.java:621) > at > io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:358) > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357) > at > io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:112) > at java.lang.Thread.run(Thread.java:745) > 15:27:29,518 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Cancelling all computations and discarding all cached data. > 15:27:29,525 INFO org.apache.flink.runtime.taskmanager.Task > - Attempting to fail task externally Keyed Aggregation -> Sink: Unnamed > (17/25) > 15:27:29,525 INFO org.apache.flink.runtime.taskmanager.Task > - Task Keyed Aggregation -> Sink: Unnamed (17/25) is already in state FAILED > 15:27:29,525 INFO org.apache.flink.runtime.taskmanager.Task > - Attempting to fail task externally Source: Read Text File Source -> Flat > Map (20/25) > 15:27:29,525 INFO org.apache.flink.runtime.taskmanager.Task > - Source: Read Text File Source -> Flat Map (20/25) switched to FAILED with > exception. > java.lang.Exception: TaskManager akka://flink/user/taskmanager disconnects > from JobManager akka.tcp://flink@10.155.208.156:6123/user/jobmanager: > JobManager is no longer reachable > at > org.apache.flink.runtime.taskmanager.TaskManager.handleJobManagerDisconnect(TaskManager.scala:826) > at > org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:297) > at > scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) > at > org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44) > at > scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28) > at > scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) > at > org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28) > at akka.actor.Actor$class.aroundReceive(Actor.scala:465) > at > org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:119) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) > at > akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46) > at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369) > at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501) > at akka.actor.ActorCell.invoke(ActorCell.scala:486) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) > at akka.dispatch.Mailbox.run(Mailbox.scala:221) > at akka.dispatch.Mailbox.exec(Mailbox.scala:231) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > 15:27:29,526 INFO org.apache.flink.runtime.taskmanager.Task > - Freeing task resources for Keyed Aggregation -> Sink: Unnamed (15/25) > 15:27:29,525 INFO org.apache.flink.runtime.taskmanager.Task > - Freeing task resources for Keyed Aggregation -> Sink: Unnamed (17/25) > 15:27:29,526 INFO org.apache.flink.runtime.taskmanager.Task > - Freeing task resources for Keyed Aggregation -> Sink: Unnamed (16/25) > 15:27:29,527 INFO org.apache.flink.runtime.taskmanager.Task > - Freeing task resources for Keyed Aggregation -> Sink: Unnamed (18/25) > 15:27:29,528 INFO org.apache.flink.runtime.taskmanager.Task > - Triggering cancellation of task code Source: Read Text File Source -> Flat > Map (20/25) (9cd65cbf15aafe0241b9d59e48b79f52). > 15:27:29,531 INFO org.apache.flink.runtime.taskmanager.Task > - Attempting to fail task externally Source: Read Text File Source -> Flat > Map (6/25) > 15:27:29,531 INFO org.apache.flink.runtime.taskmanager.Task > - Source: Read Text File Source -> Flat Map (6/25) switched to FAILED with > exception. > java.lang.Exception: TaskManager akka://flink/user/taskmanager disconnects > from JobManager akka.tcp://flink@10.155.208.156:6123/user/jobmanager: > JobManager is no longer reachable > at > org.apache.flink.runtime.taskmanager.TaskManager.handleJobManagerDisconnect(TaskManager.scala:826) > at > org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:297) > at > scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) > at > org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44) > at > scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28) > at > scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) > at > org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28) > at akka.actor.Actor$class.aroundReceive(Actor.scala:465) > at > org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:119) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) > at > akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46) > at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369) > at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501) > at akka.actor.ActorCell.invoke(ActorCell.scala:486) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) > at akka.dispatch.Mailbox.run(Mailbox.scala:221) > at akka.dispatch.Mailbox.exec(Mailbox.scala:231) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > 15:27:29,532 INFO org.apache.flink.runtime.taskmanager.Task > - Freeing task resources for Source: Read Text File Source -> Flat Map > (20/25) > 15:27:29,535 INFO org.apache.flink.runtime.taskmanager.Task > - Triggering cancellation of task code Source: Read Text File Source -> Flat > Map (6/25) (2151d07383014506c5f1b0283b1986de). > 15:27:29,535 INFO org.apache.flink.runtime.taskmanager.Task > - Attempting to fail task externally Keyed Aggregation -> Sink: Unnamed > (15/25) > 15:27:29,535 INFO org.apache.flink.runtime.taskmanager.Task > - Task Keyed Aggregation -> Sink: Unnamed (15/25) is already in state FAILED > 15:27:29,535 INFO org.apache.flink.runtime.taskmanager.Task > - Attempting to fail task externally Source: Read Text File Source -> Flat > Map (13/25) > 15:27:29,535 INFO org.apache.flink.runtime.taskmanager.Task > - Source: Read Text File Source -> Flat Map (13/25) switched to FAILED with > exception. > java.lang.Exception: TaskManager akka://flink/user/taskmanager disconnects > from JobManager akka.tcp://flink@10.155.208.156:6123/user/jobmanager: > JobManager is no longer reachable > at > org.apache.flink.runtime.taskmanager.TaskManager.handleJobManagerDisconnect(TaskManager.scala:826) > at > org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:297) > at > scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) > at > org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44) > at > scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:33) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28) > at > scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) > at > org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28) > at akka.actor.Actor$class.aroundReceive(Actor.scala:465) > at > org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:119) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) > at > akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46) > at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369) > at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501) > at akka.actor.ActorCell.invoke(ActorCell.scala:486) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) > at akka.dispatch.Mailbox.run(Mailbox.scala:221) > at akka.dispatch.Mailbox.exec(Mailbox.scala:231) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > 15:27:29,536 INFO org.apache.flink.runtime.taskmanager.Task > - Freeing task resources for Source: Read Text File Source -> Flat Map > (6/25) > 15:27:29,538 INFO org.apache.flink.runtime.taskmanager.Task > - Triggering cancellation of task code Source: Read Text File Source -> Flat > Map (13/25) (a6a0f05e0362dbf0505c1439893e53e4). > 15:27:29,539 INFO org.apache.flink.runtime.taskmanager.Task > - Attempting to fail task externally Keyed Aggregation -> Sink: Unnamed > (16/25) > 15:27:29,539 INFO org.apache.flink.runtime.taskmanager.Task > - Task Keyed Aggregation -> Sink: Unnamed (16/25) is already in state FAILED > 15:27:29,539 INFO org.apache.flink.runtime.taskmanager.Task > - Attempting to fail task externally Source: Read Text File Source -> Flat > Map (24/25) > 15:27:29,539 INFO org.apache.flink.runtime.taskmanager.Task > - Source: Read Text File Source -> Flat Map (24/25) switched to FAILED with > exception. > java.lang.Exception: TaskManager akka://flink/user/taskmanager disconnects > from JobManager akka.tcp://flink@10.155.208.156:6123/user/jobmanager: > JobManager is no longer reachable > at > org.apache.flink.runtime.taskmanager.TaskManager.handleJobManagerDisconnect(TaskManager.scala:826) > at > org.apache.flink.runtime.taskmanager.TaskManager$$anonfun$handleMessage$1.applyOrElse(TaskManager.scala:297) > at > scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) > at > org.apache.flink.runtime.LeaderSessionMessageFilter$$anonfun$receive$1.applyOrElse(LeaderSessionMessageFilter.scala:44) > at > scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:36) > at > org.apache.flink.runtime.LogMessages$$anon$1.apply(LogMessages.scala:28) > at > scala.PartialFunction$class.applyOrElse(PartialFunction.scala:123) > at > org.apache.flink.runtime.LogMessages$$anon$1.applyOrElse(LogMessages.scala:28) > at akka.actor.Actor$class.aroundReceive(Actor.scala:465) > at > org.apache.flink.runtime.taskmanager.TaskManager.aroundReceive(TaskManager.scala:119) > at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516) > at > akka.actor.dungeon.DeathWatch$class.receivedTerminated(DeathWatch.scala:46) > at akka.actor.ActorCell.receivedTerminated(ActorCell.scala:369) > at akka.actor.ActorCell.autoReceiveMessage(ActorCell.scala:501) > at akka.actor.ActorCell.invoke(ActorCell.scala:486) > at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:254) > at akka.dispatch.Mailbox.run(Mailbox.scala:221) > at akka.dispatch.Mailbox.exec(Mailbox.scala:231) > at > scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.pollAndExecAll(ForkJoinPool.java:1253) > at > scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1346) > at > scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) > at > scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) > 15:27:29,541 INFO org.apache.flink.runtime.taskmanager.Task > - Triggering cancellation of task code Source: Read Text File Source -> Flat > Map (24/25) (96df0f3c0990c5a573c67a31a6207fa2). > 15:27:29,541 INFO org.apache.flink.runtime.taskmanager.Task > - Attempting to fail task externally Keyed Aggregation -> Sink: Unnamed > (18/25) > 15:27:29,541 INFO org.apache.flink.runtime.taskmanager.Task > - Task Keyed Aggregation -> Sink: Unnamed (18/25) is already in state FAILED > 15:27:29,542 INFO org.apache.flink.runtime.taskmanager.Task > - Freeing task resources for Source: Read Text File Source -> Flat Map > (24/25) > 15:27:29,543 INFO org.apache.flink.runtime.taskmanager.Task > - Freeing task resources for Source: Read Text File Source -> Flat Map > (13/25) > 15:27:29,551 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Disassociating from JobManager > 15:27:29,572 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has a UID that has been quarantined. Association aborted. > 15:27:29,582 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:27:29,587 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:27:29,623 INFO org.apache.flink.runtime.io.network.netty.NettyClient > - Successful shutdown (took 17 ms). > 15:27:29,625 INFO org.apache.flink.runtime.io.network.netty.NettyServer > - Successful shutdown (took 2 ms). > 15:27:29,640 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 1, timeout: > 500 milliseconds) > 15:27:30,157 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 2, timeout: > 1000 milliseconds) > 15:27:31,177 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 3, timeout: > 2000 milliseconds) > 15:27:33,193 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 4, timeout: > 4000 milliseconds) > 15:27:37,203 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 5, timeout: > 8000 milliseconds) > 15:27:37,218 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:27:45,215 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 6, timeout: > 16000 milliseconds) > 15:27:45,235 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:28:01,223 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 7, timeout: 30 > seconds) > 15:28:01,237 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:28:31,243 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 8, timeout: 30 > seconds) > 15:28:31,261 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:28:45,960 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:29:01,253 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 9, timeout: 30 > seconds) > 15:29:01,273 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:30:11,545 INFO org.apache.flink.runtime.taskmanager.TaskManager > - Trying to register at JobManager > akka.tcp://flink@10.155.208.156:6123/user/jobmanager (attempt 10, timeout: > 30 seconds) > 15:30:11,562 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > 15:30:25,958 WARN Remoting > - Tried to associate with unreachable remote address > [akka.tcp://flink@10.155.208.156:6123]. Address is now gated for 5000 ms, > all messages to this address will be delivered to dead letters. Reason: The > remote system has quarantined this system. No further associations to the > remote system are possible until this system is restarted. > > Could anyone explain what is the reason for the systems getting > disassociated and be quarantined? > > Kind Regards, > Ravinder Kaur >