I am unable to restore a Flink 1.9 savepoint into a 1.11 runtime, for the very interesting reason that the Savepoint class was renamed and repackaged between those two releases. Apparently a Kryo serializer has that class registered in the 1.9 runtime. I can’t think of a good reason for that class to be registered with Kryo; none of the job operators reference any such thing. Yet there it is, causing the following exception and preventing the upgrade to the new runtime.
Caused by: java.lang.IllegalStateException: Missing value for the key 'org.apache.flink.runtime.checkpoint.savepoint.Savepoint'
    at org.apache.flink.util.LinkedOptionalMap.unwrapOptionals(LinkedOptionalMap.java:190) ~[flink-dist_2.11-1.11.3.jar:1.11.3]
    at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshot.restoreSerializer(KryoSerializerSnapshot.java:86) ~[flink-dist_2.11-1.11.3.jar:1.11.3]

There doesn’t seem to be any way to unregister a class from Kryo. And the mechanism for dealing with missing classes looks to me like it has never worked as advertised. Rather than registering a dummy class for a missing class name, it registers a null, which leads to the exception above and prevents the savepoint from being restored.

The code that returns a null instead of a dummy is here:
https://github.com/apache/flink/blob/e8cfe6701b9768d1f1fe4488640cba5f9b42d73f/flink-core/src/main/java/org/apache/flink/api/java/typeutils/runtime/kryo/KryoSerializerSnapshotData.java#L263

It results in this log entry, which claims a dummy class is being used:

2021-07-27 18:38:11,703 WARN  org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializerSnapshotData [] - Cannot find registered class org.apache.flink.runtime.checkpoint.savepoint.Savepoint for Kryo serialization in classpath; using a dummy class as a placeholder.
java.lang.ClassNotFoundException: org.apache.flink.runtime.checkpoint.savepoint.Savepoint

One way or another I need to be able to restore a 1.9 savepoint into 1.11. Perhaps the Kryo registration needs to be cleansed from wherever it is lurking in the 1.9 savepoint, or an effective dummy needs to be substituted when reading it into 1.11.

Has anyone else encountered this problem, or does anyone have advice to offer?
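
P.S. To make the failure mode concrete, here is a minimal standalone sketch of what I think is happening. It is not Flink's code: the names MissingRegistrationSketch, resolve, unwrap and DummyPlaceholder are made up for illustration, and the logic only mimics the behavior I infer from the stack trace and the log message. It assumes the old Savepoint class is not on the classpath, which is the situation in the 1.11 runtime.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Flink.
public class MissingRegistrationSketch {

    /** Stand-in that could be registered when a class name cannot be loaded. */
    static final class DummyPlaceholder {}

    /**
     * Resolves each registered class name. A name that cannot be loaded maps
     * to null (the behavior I believe I am seeing) or, if substituteDummy is
     * true, to the placeholder class (the behavior the log message promises).
     */
    static Map<String, Class<?>> resolve(List<String> names, boolean substituteDummy) {
        Map<String, Class<?>> resolved = new LinkedHashMap<>();
        for (String name : names) {
            Class<?> clazz;
            try {
                clazz = Class.forName(name);
            } catch (ClassNotFoundException e) {
                clazz = substituteDummy ? DummyPlaceholder.class : null;
            }
            resolved.put(name, clazz);
        }
        return resolved;
    }

    /** Rejects any entry without a value, in the spirit of the restore path. */
    static Map<String, Class<?>> unwrap(Map<String, Class<?>> resolved) {
        for (Map.Entry<String, Class<?>> entry : resolved.entrySet()) {
            if (entry.getValue() == null) {
                throw new IllegalStateException(
                        "Missing value for the key '" + entry.getKey() + "'");
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        List<String> registered = Arrays.asList(
                "java.lang.String",
                "org.apache.flink.runtime.checkpoint.savepoint.Savepoint");

        // Null for the missing class: unwrapping is expected to fail.
        try {
            unwrap(resolve(registered, false));
        } catch (IllegalStateException e) {
            System.out.println("Restore failed: " + e.getMessage());
        }

        // Placeholder for the missing class: unwrapping goes through.
        System.out.println("Restored " + unwrap(resolve(registered, true)).size() + " registrations");
    }
}
```

In this sketch, passing false for substituteDummy reproduces the same kind of IllegalStateException as in my stack trace, while passing true lets every registration resolve to something non-null. That second behavior is what I would have expected from the "using a dummy class as a placeholder" warning.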