Hi, we get an error when Flink 1.20 resumes a job from a savepoint that was 
written with Flink 1.19 (this is an upgrade scenario). The error is 
`Caused by: java.lang.ClassNotFoundException: 
org.apache.flink.runtime.jobgraph.RestoreMode`; the full stack trace is below.

I understand that `org.apache.flink.runtime.jobgraph.RestoreMode#LEGACY` was 
deprecated in Flink 1.19, but the whole class has been removed in Flink 1.20. I 
don't think we ever serialized that class in our code, but somehow the job 
recovery needs it. I can only suppose that it was part of the serialized data in 
the savepoint. Did I miss a step somewhere, or is there an issue here?
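
For illustration, here is a minimal sketch of how this kind of failure can arise with plain Java serialization, assuming the enum was written as a field of some serialized object. The stack trace below goes through `ObjectInputStream.readEnum`, which is consistent with that. All names here (`MissingEnumDemo`, `Settings`, `Mode`, `HidingInputStream`) are hypothetical stand-ins, not Flink classes, and the missing class is simulated by a `resolveClass` override rather than by actually removing it.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;
import java.io.UncheckedIOException;

public class MissingEnumDemo {
    // Hypothetical enum that exists on the writing side only.
    public enum Mode { LEGACY, CLAIM }

    // Hypothetical object that carries the enum as an ordinary field.
    public static class Settings implements Serializable {
        private static final long serialVersionUID = 1L;
        final Mode mode;
        public Settings(Mode mode) { this.mode = mode; }
    }

    // Simulates a reader whose classpath no longer contains one class.
    static class HidingInputStream extends ObjectInputStream {
        private final String hiddenClassName;

        HidingInputStream(InputStream in, String hiddenClassName) throws IOException {
            super(in);
            this.hiddenClassName = hiddenClassName;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            if (desc.getName().equals(hiddenClassName)) {
                throw new ClassNotFoundException(desc.getName());
            }
            return super.resolveClass(desc);
        }
    }

    // Serializes an object with standard Java serialization.
    public static byte[] write(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Reads the object back while pretending hiddenClassName is not on the classpath.
    public static String tryRead(byte[] data, String hiddenClassName) {
        try (ObjectInputStream in =
                new HidingInputStream(new ByteArrayInputStream(data), hiddenClassName)) {
            in.readObject();
            return "ok";
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException: " + e.getMessage();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = write(new Settings(Mode.LEGACY));
        System.out.println(tryRead(data, "none"));               // every class resolves
        System.out.println(tryRead(data, Mode.class.getName())); // enum class "removed"
    }
}
```

The point of the sketch is that user code never has to reference the enum directly: it is enough for any object in the serialized graph to hold it as a field.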

I see a similar issue in Flink-CDC here: 
https://issues.apache.org/jira/browse/FLINK-36105, but I do not think we use 
Flink-CDC (unless it is pulled in internally via other dependencies).

Any thoughts?

Kind Regards

JM

```
java.util.concurrent.CompletionException: org.apache.flink.util.FlinkRuntimeException: Could not recover job with job id 2c5240ef004943b94e752c06eade10bd.
  at java.base/java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:315)
  at java.base/java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:320)
  at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:649)
  at java.base/java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:482)
  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
  at java.base/java.lang.Thread.run(Thread.java:857)
Caused by: org.apache.flink.util.FlinkRuntimeException: Could not recover job with job id 2c5240ef004943b94e752c06eade10bd.
  at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.tryRecoverJob(SessionDispatcherLeaderProcess.java:183)
  at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJobs(SessionDispatcherLeaderProcess.java:150)
  at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.lambda$recoverJobsIfRunning$2(SessionDispatcherLeaderProcess.java:139)
  at org.apache.flink.runtime.dispatcher.runner.AbstractDispatcherLeaderProcess.supplyUnsynchronizedIfRunning(AbstractDispatcherLeaderProcess.java:198)
  at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.recoverJobsIfRunning(SessionDispatcherLeaderProcess.java:139)
  at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.lambda$createDispatcherBasedOnRecoveredJobGraphsAndRecoveredDirtyJobResults$1(SessionDispatcherLeaderProcess.java:129)
  at java.base/java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:646)
  ... 4 more
Caused by: org.apache.flink.util.FlinkException: Could not retrieve submitted JobGraph from state handle under /2c5240ef004943b94e752c06eade10bd. This indicates that you are trying to recover from state written by an older Flink version which is not compatible. Try cleaning the state handle store.
  at org.apache.flink.runtime.jobmanager.DefaultJobGraphStore.recoverJobGraph(DefaultJobGraphStore.java:170)
  at org.apache.flink.runtime.dispatcher.runner.SessionDispatcherLeaderProcess.tryRecoverJob(SessionDispatcherLeaderProcess.java:174)
  ... 10 more
Caused by: java.lang.ClassNotFoundException: org.apache.flink.runtime.jobgraph.RestoreMode
  at java.base/java.lang.Class.forNameImpl(Native Method)
  at java.base/java.lang.Class.forName(Class.java:429)
  at org.apache.flink.util.InstantiationUtil$ClassLoaderObjectInputStream.resolveClass(InstantiationUtil.java:78)
  at java.base/java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:2153)
  at java.base/java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:2017)
  at java.base/java.io.ObjectInputStream.readEnum(ObjectInputStream.java:2295)
  at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1846)
  at java.base/java.io.ObjectInputStream$FieldValues.<init>(ObjectInputStream.java:2732)
  at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2581)
  at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2376)
  at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1852)
  at java.base/java.io.ObjectInputStream$FieldValues.<init>(ObjectInputStream.java:2732)
  at java.base/java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2581)
  at java.base/java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2376)
  at java.base/java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1852)
  at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:591)
  at java.base/java.io.ObjectInputStream.readObject(ObjectInputStream.java:501)
  at org.apache.flink.util.InstantiationUtil.deserializeObject(InstantiationUtil.java:533)
  at org.apache.flink.runtime.state.RetrievableStreamStateHandle.retrieveState(RetrievableStreamStateHandle.java:59)
  at org.apache.flink.runtime.jobmanager.DefaultJobGraphStore.recoverJobGraph(DefaultJobGraphStore.java:168)
  ... 11 more
  ```

