Hi, we recently upgraded our test environment from Flink 1.3.2 to Flink 1.4.0.
We are using a high-availability setup for the JobManager. Now, when I open the job details in the web UI, the call often times out and the following error appears in the JobManager log:

akka.remote.MessageSerializer$SerializationException: Failed to serialize remote message [class org.apache.flink.runtime.messages.JobManagerMessages$JobFound] using serializer [class akka.serialization.JavaSerializer].
    at akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:61) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:889) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:889) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter.serializeMessage(Endpoint.scala:888) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter.writeSend(Endpoint.scala:780) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter$$anonfun$4.applyOrElse(Endpoint.scala:755) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.actor.Actor$class.aroundReceive(Actor.scala:502) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:446) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.actor.ActorCell.invoke(ActorCell.scala:495) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.dispatch.Mailbox.run(Mailbox.scala:224) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:234) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.11-1.4.0.jar:1.4.0]
Caused by: java.io.NotSerializableException: org.apache.flink.runtime.executiongraph.ExecutionGraph
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) ~[na:1.8.0_131]
    at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply$mcV$sp(Serializer.scala:321) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:321) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:321) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.serialization.JavaSerializer.toBinary(Serializer.scala:321) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:47) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    ... 17 common frames omitted

I isolated it further, and it seems to be triggered by this call:

https://hostname/jobs/28076fffbcf7eab3f17900a54cc7c41d

I cannot reproduce it on my local laptop without the HA setup. Before I dig any deeper, has anyone already come across this?
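
For what it's worth, my reading of the "Caused by" line is that the JobFound message holds a reference to an ExecutionGraph, and Java serialization rejects it once the message has to be sent to a remote actor. The sketch below reproduces only that general mechanism in plain Java. The class names (SerializationSketch, Graph, FoundMessage) and the job id are hypothetical stand-ins, not Flink classes, and I have not confirmed that this is the actual cause inside Flink.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch {

    // Hypothetical stand-in for a class that does not implement Serializable.
    static class Graph {
        final String jobId;
        Graph(String jobId) { this.jobId = jobId; }
    }

    // Hypothetical stand-in for a Serializable message that carries a payload.
    static class FoundMessage implements Serializable {
        private static final long serialVersionUID = 1L;
        final Object payload;
        FoundMessage(Object payload) { this.payload = payload; }
    }

    // Returns null if the message serializes, otherwise the exception message,
    // which for NotSerializableException is the name of the offending class.
    static String trySerialize(Object message) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(message);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // A String payload is Serializable, so this succeeds.
        System.out.println(trySerialize(new FoundMessage("plain payload")));
        // A Graph payload is not Serializable, so this reports the Graph class.
        System.out.println(trySerialize(new FoundMessage(new Graph("job-1"))));
    }
}
```

The message itself is Serializable in both cases; the failure only shows up when the object graph behind it contains a non-serializable instance. That would be consistent with the error appearing only when the message crosses a remote boundary, although I am only guessing that this explains why I don't see it locally.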