Hi,

We recently upgraded our test environment from Flink 1.3.2 to Flink
1.4.0.

We are using a high-availability setup for the job manager. Now, when I
open the job details in the web UI, the call often times out and the
following error appears in the job manager log:


akka.remote.MessageSerializer$SerializationException: Failed to serialize remote message [class org.apache.flink.runtime.messages.JobManagerMessages$JobFound] using serializer [class akka.serialization.JavaSerializer].
    at akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:61) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:889) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter$$anonfun$serializeMessage$1.apply(Endpoint.scala:889) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter.serializeMessage(Endpoint.scala:888) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter.writeSend(Endpoint.scala:780) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointWriter$$anonfun$4.applyOrElse(Endpoint.scala:755) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.actor.Actor$class.aroundReceive(Actor.scala:502) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:446) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:526) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.actor.ActorCell.invoke(ActorCell.scala:495) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:257) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.dispatch.Mailbox.run(Mailbox.scala:224) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.dispatch.Mailbox.exec(Mailbox.scala:234) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [flink-dist_2.11-1.4.0.jar:1.4.0]
Caused by: java.io.NotSerializableException: org.apache.flink.runtime.executiongraph.ExecutionGraph
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1184) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.defaultWriteFields(ObjectOutputStream.java:1548) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.writeSerialData(ObjectOutputStream.java:1509) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1432) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178) ~[na:1.8.0_131]
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:348) ~[na:1.8.0_131]
    at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply$mcV$sp(Serializer.scala:321) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:321) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.serialization.JavaSerializer$$anonfun$toBinary$1.apply(Serializer.scala:321) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.serialization.JavaSerializer.toBinary(Serializer.scala:321) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    at akka.remote.MessageSerializer$.serialize(MessageSerializer.scala:47) ~[flink-dist_2.11-1.4.0.jar:1.4.0]
    ... 17 common frames omitted
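
As far as I can tell, the "Caused by" part is the generic Java serialization failure: a message that is itself Serializable holds a reference to an object that is not. Here is a minimal sketch of that mechanism in plain Java. The class names GraphLike and JobFoundLike are hypothetical stand-ins I made up for illustration; they are not Flink's actual classes, and I am only assuming that JobFound carries the ExecutionGraph in a similar way.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializationSketch {

    // Hypothetical stand-in for a class that does not implement Serializable.
    static class GraphLike {
        final String jobId;

        GraphLike(String jobId) {
            this.jobId = jobId;
        }
    }

    // Hypothetical stand-in for a Serializable message that carries such an object.
    static class JobFoundLike implements Serializable {
        private static final long serialVersionUID = 1L;
        final GraphLike graph;

        JobFoundLike(GraphLike graph) {
            this.graph = graph;
        }
    }

    // Tries plain Java serialization. Returns the class name reported by
    // NotSerializableException, or null if the object serialized cleanly.
    static String failingClass(Object message) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(message);
            return null;
        } catch (NotSerializableException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) throws IOException {
        // The wrapper is Serializable, but its field is not, so this fails.
        System.out.println(failingClass(new JobFoundLike(new GraphLike("some-job-id"))));
        // A plain String serializes without problems.
        System.out.println(failingClass("plain string"));
    }
}
```

In this sketch the exception names the nested, non-serializable class rather than the outer message, which is the same shape as the trace above: JobFound is the message being sent, ExecutionGraph is the class reported as not serializable.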



I isolated it further, and it seems to be triggered by this call:

https://hostname/jobs/28076fffbcf7eab3f17900a54cc7c41d

I cannot reproduce it on my local laptop without an HA setup.
Before I dig any deeper, has anyone already come across this?
