Today I kept receiving a timeout exception when stopping my job with a
savepoint.
This happened with Flink 1.12.2 running on EMR.

I had to use the deprecated cancel with savepoint feature instead.

In fact, stopping with a savepoint, creating a savepoint, and cancelling
with a savepoint all gave me the timeout exception.

However, cancel with savepoint did start creating a savepoint on the
cluster.

The program finished with the following exception:

org.apache.flink.util.FlinkException: Could not stop with a savepoint job "5d6100984035db9541e9f08ecbd311bf".
    at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:585)
    at org.apache.flink.client.cli.CliFrontend.runClusterAction(CliFrontend.java:1006)
    at org.apache.flink.client.cli.CliFrontend.stop(CliFrontend.java:573)
    at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1073)
    at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1136)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
    at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
    at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1136)
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1784)
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1928)
    at org.apache.flink.client.cli.CliFrontend.lambda$stop$5(CliFrontend.java:583)
    ... 9 more
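
The "Caused by" frames show the client blocked in a timed CompletableFuture.get(). My guess, which I have not confirmed, is that the CLI simply stopped waiting while the cluster kept working, which would fit the savepoint still being started. A minimal sketch of that general Java behaviour follows. It is not Flink code: the class name, the delays, and the returned path are all made up for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ClientTimeoutSketch {

    /** Waits on the future for at most timeoutMillis; returns true if the wait timed out. */
    static boolean waitTimesOut(CompletableFuture<String> operation, long timeoutMillis)
            throws Exception {
        try {
            operation.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            // Only the caller gives up here; the future itself is not cancelled.
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a savepoint that takes longer than the client waits.
        CompletableFuture<String> savepoint = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "hypothetical-savepoint-path";
        });

        boolean timedOut = waitTimesOut(savepoint, 50);
        System.out.println("client wait timed out: " + timedOut);

        // An untimed get still returns the result once the slow work finishes.
        System.out.println("operation result: " + savepoint.get());
    }
}
```

If this reading is right, the client-side exception alone does not say whether the savepoint failed. The cluster side would have to be checked for that.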
