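Hi all,
My Flink job restarts at irregular intervals. The exception logs almost always look like the one below. Could anyone help explain what causes this?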
2020-07-01 20:20:43.875 [flink-akka.actor.default-dispatcher-27] INFO akka.remote.RemoteActorRefProvider$RemotingTerminator flink-akka.remote.default-remote-dispatcher-22 - Remoting shut down.
2020-07-01 20:20:43.875 [flink-metrics-16] INFO akka.remote.RemoteActorRefProvider$RemotingTerminator flink-metrics-akka.remote.default-remote-dispatcher-14 - Remoting shut down.
2020-07-01 20:20:43.891 [flink-metrics-16] INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService - Stopped Akka RPC service.
2020-07-01 20:20:43.895 [flink-akka.actor.default-dispatcher-15] INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster entrypoint process YarnJobClusterEntrypoint with exit code 2.
java.util.concurrent.CompletionException: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/resourcemanager#-781959047]] after [10000 ms]. Message of type [org.apache.flink.runtime.rpc.messages.LocalFencedMessage]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:593)
at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
at org.apache.flink.runtime.concurrent.FutureUtils$1.onComplete(FutureUtils.java:871)
at akka.dispatch.OnComplete.internal(Future.scala:263)
at akka.dispatch.OnComplete.internal(Future.scala:261)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:191)
at akka.dispatch.japi$CallbackBridge.apply(Future.scala:188)
at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:32)
at org.apache.flink.runtime.concurrent.Executors$DirectExecutionContext.execute(Executors.java:74)
at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:40)
at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:248)
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:644)
at akka.actor.Scheduler$$anon$4.run(Scheduler.scala:205)
at scala.concurrent.Future$InternalCallbackExecutor$.unbatchedExecute(Future.scala:599)
at scala.concurrent.BatchingExecutor$class.execute(BatchingExecutor.scala:109)
at scala.concurrent.Future$InternalCallbackExecutor$.execute(Future.scala:597)
at akka.actor.LightArrayRevolverScheduler$TaskHolder.executeTask(LightArrayRevolverScheduler.scala:328)
at akka.actor.LightArrayRevolverScheduler$$anon$4.executeBucket$1(LightArrayRevolverScheduler.scala:279)
at akka.actor.LightArrayRevolverScheduler$$anon$4.nextTick(LightArrayRevolverScheduler.scala:283)
at akka.actor.LightArrayRevolverScheduler$$anon$4.run(LightArrayRevolverScheduler.scala:235)
at java.lang.Thread.run(Thread.java:745)
Caused by: akka.pattern.AskTimeoutException: Ask timed out on [Actor[akka://flink/user/resourcemanager#-781959047]] after [10000 ms]. Message of type [org.apache.flink.runtime.rpc.messages.LocalFencedMessage]. A typical reason for `AskTimeoutException` is that the recipient actor didn't send a reply.
at akka.pattern.PromiseActorRef$$anonfun$2.apply(AskSupport.scala:635)
at akka.pattern.PromiseActorRef$$anonfun$2.apply(AskSupport.scala:635)
at akka.pattern.PromiseActorRef$$anonfun$1.apply$mcV$sp(AskSupport.scala:648)
... 9 common frames omitted